perm filename SYS4[TLK,DBL]5 blob sn#174632 filedate 1975-08-25 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00035 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00004 00002	.DEVICE XGP
C00006 00003	.MACRO B ⊂ BEGIN VERBATIM GROUP ⊃
C00010 00004	.NSEC(Objective)
C00020 00005	.NSECP(Why Choose Math Research?)
C00023 00006	.NSECP(Potential Applications)
C00026 00007	.NSECP(Measuring Success)
C00030 00008	.NSECP(Experimenting with AM)
C00035 00009	.NSECP(Model of Math Research)
C00047 00010	.SSEC(Textbook-Order vs Research-Order)
C00065 00011	.SSEC(Metaphors to other processes)
C00073 00012	.NSECP(Knowledge AM Starts With)
C00079 00013	.NSECP(AM's Representation of Knowledge)
C00098 00014	.NSECP(Flow of Control in AM)
C00108 00015	.SSEC(The Environment)
C00130 00016	.NSECP(Details: Getting AM going)
C00145 00017	.NSECP(Examples of Individual Modules)
C00168 00018	.NSECP(Example: The Modules Interacting)
C00172 00019	.SSEC(A Hypothetical Session)
C00179 00020	.SSEC(Comments on the Session)
C00183 00021	.SSEC(The Session Revisited: An In-Depth Example)
C00186 00022	.SSSEC(Looking at the new concept named "FACTORS")
C00200 00023	.SSSEC(Conjecturing the Unique Factorization Theorem)
C00211 00024	.SSSEC(Proving Existence)
C00222 00025	.SSEC(A Few Other Examples)
C00239 00026	.SSSEC(Considering New Compositions of Operations)
C00260 00027	.NSECP(Parameters: Size / Timetable / Results so far)
C00266 00028	.ASEC(Bibliography)
C00278 00029	.ASSEC(Books and Memos)
C00298 00030	.ASSEC(Articles)
C00307 00031	.ASEC(Appendix 1: Background for readers unfamiliar with Beings)
C00316 00032	.ASSEC(Internal Design of BEINGs)
C00326 00033	.ASSEC(BEINGs Interacting)
C00333 00034	.ASSEC(Aspects of BEINGs Systems)
C00342 00035	.COMMENT table of contents
C00343 ENDMK
C⊗;
.DEVICE XGP

.FONT 1 "BASL30"
.FONT 2 "BASB30"
.FONT 4  "BASI30"
.FONT 5  "BDR40"
.FONT 6  "NGR25"
.FONT 7  "NGR20"
.FONT 8  "GRFX35"
.FONT A "SUP"
.FONT B "SUB"
.TURN ON "↑α↓_π[]{"
.TURN ON "⊗" FOR "%"
.TURN ON "@" FOR "%"
.PAGE FRAME 54 HIGH 89 WIDE
.TITLE AREA HEADING LINES 1 TO 2
.AREA TEXT LINES 3 TO 53
.COUNT PAGE PRINTING "1"
.TABBREAK
.ODDLEFTBORDER←EVENLEFTBORDER←850
.AT "ffi" ⊂ IF THISFONT ≤ 4 THEN "≠"  ELSE "fαfαi" ⊃;
.AT "ffl" ⊂ IF THISFONT ≤ 4 THEN "α∞" ELSE "fαfαl" ⊃;
.AT "ff"  ⊂ IF THISFONT ≤ 4 THEN "≥"  ELSE "fαf" ⊃;
.AT "fi"  ⊂ IF THISFONT ≤ 4 THEN "α≡" ELSE "fαi" ⊃;
.AT "fl"  ⊂ IF THISFONT ≤ 4 THEN "∨"  ELSE "fαl" ⊃;
.SELECT 1
.MACRO B ⊂ BEGIN VERBATIM GROUP ⊃
.MACRO E ⊂ APART END ⊃
.MACRO D ⊂ ONCE PREFACE 100 MILLS ⊃
.MACRO BB ⊂ BEGIN  FILL  ADJUST SINGLE SPACE PREFACE 1 INDENT 0,4,0 ⊃
.MACRO BGIV ⊂ BEGIN NOFILL SELECT 7 INDENT 0 PREFACE 0  TURN OFF "{}" TURN ON "↑↓" ⊃
.MACRO B0 ⊂ BEGIN  WIDEN 2,7 SELECT 8 NOFILL PREFACE 0 MILLS TURN OFF "↑↓α"  GROUP ⊃
.MACRO B7 ⊂ BEGIN  WIDEN 7,7 SELECT 8 NOFILL PREFACE 0 MILLS TURN OFF "↑↓α"  GROUP ⊃
.MACRO W(F) ⊂ SELECT F NOFILL SINGLE SPACE; PREFACE 0; WIDEN 7,7 ⊃
.MACRO FAD ⊂ FILL ADJUST DOUBLE SPACE PREFACE 2 ⊃
.MACRO FAS ⊂ FILL ADJUST SINGLE SPACE PREFACE 1 ⊃
.FAD

.MYFOOT←1
.FOOTSEP←"________________________________________________________________________________________"
.COUNT FOOTNOTE INLINE PRINTING "⊗A1⊗*"
.AT "$$" ENTRY "*" ⊂ XGENLINES←XGENLINES-1; NEXT FOOTNOTE; !;
.SEND FOOT ⊂ TURN ON "[]{" SELECT 7; SPACING 0; PREFACE 0; INDENT 0,10
⊗A{MYFOOT}⊗* ENTRY
.MYFOOT←MYFOOT+1
.BREAK ⊃ ⊃

.MACRO NSECP(A)  ⊂   SSECNUM←0
.SECNUM←SECNUM+1 
.SECTION←"A"
.SKIP TO COLUMN 1
.TURN ON "{∞→"   
.SEND CONTENTS ⊂
@1{SECNUM}. A⊗* ∞.→ {PAGE}
.⊃
.TURN OFF "{∞→"   
.ONCE CENTER TURN ON "{}"
@5↓_{SECNUM}. A_↓⊗*  
.  ⊃


.MACRO ASEC(A)  ⊂  SSECNUM←0
.SECTION←"A"
.SKIP TO COLUMN 1
.TURN ON "{∞→"   
.SEND CONTENTS ⊂
@1 A⊗*
.⊃
.TURN OFF "{∞→"   
.ONCE CENTER TURN ON "{}"
@5↓_A_↓⊗*  
.  ⊃


.MACRO NSEC(A)  ⊂  
.SSECNUM←0
.SECNUM←SECNUM+1 
.TURN ON "{∞→"   
.SEND CONTENTS ⊂
@1{SECNUM}. A⊗* ∞.→ {PAGE}
.⊃
.TURN OFF "{∞→"   
.SECTION←"A"
.GROUP SKIP 3
.ONCE CENTER TURN ON "{}"
@5↓_{SECNUM}. A_↓⊗*  
.  ⊃


.MACRO SSEC(A)  ⊂  TURN ON "{∞→"   
.SSECNUM←SSECNUM+1
.SSSECNUM←0
.SEND CONTENTS ⊂
@7               A⊗* ∞.→ {PAGE}
.⊃
.TURN OFF "{∞→"   
.ONCE INDENT 6 TURN ON "{}"
@2↓_{SECNUM}.{SSECNUM}. A_↓⊗*  
. ⊃

.MACRO ASSEC(A)  ⊂  TURN ON "{∞→"   
.SEND CONTENTS ⊂
@7               A⊗*
.⊃
.TURN OFF "{∞→"   
.ONCE INDENT 6 TURN ON "{}"
@2↓_A_↓⊗*  
. ⊃

.MACRO SSSEC(A)  ⊂  TURN ON "{∞→"   
.SSSECNUM←SSSECNUM+1
.TURN OFF "{∞→"   
.ONCE INDENT 1 TURN ON "{}"
@2↓_{SECNUM}.{SSECNUM}.{SSSECNUM}. A_↓⊗*  
. ⊃

.SECNUM←0
.SELECT 1
.INSERT CONTENTS
.PAGE←0
.PORTION THESIS
.NARROW 2,7
.TURN OFF "{∞→}"   
.PAGE←0
.NEXT PAGE
.INDENT 0
.FAD
.TURN OFF "{∞→}"   
.ONCE CENTER
.NSEC(Objective)

.EVERY HEADING(⊗7Automated Math Theory Formation,Doug Lenat,{DATE}     page  {PAGE}⊗*)

Throughout all of science, one of the most important issues is that of
theory formation: 
how to recognize potentially related concepts, how to tie such concepts
together productively,
how to use intuition, how to choose, when to give up and try another
approach,
how to extend, when to define, what to examine, what to ignore,...
These questions are difficult to answer precisely, even in a
single domain.  
For my dissertation, I am investigating
creative theory formation in mathematics: 
how math researchers propose interesting yet plausible
hypotheses, test them, and develop them into interesting theories.$$A mathematical
theory encompasses (i) a basis of primitive objects and activities, (ii) a
foundation of axiomatic constraints on the basis, (iii) a body of definitions
tying together basic and already-defined concepts, (iv) a body of theorems which
are implied by the foundation and the definitions, (v) some interpretations in
terms of either reality or  other theories.*

The experimental ⊗4vehicle⊗* of my research is
a system, a computer program called AM
(for ⊗2↓_A_↓⊗*utomated ⊗2↓_M_↓⊗*athematician),
which can actually do mathematical research:
propose axiomatizations,$$ Based on AM's earlier successes and/or on simulated
real-world situations *  
propose and prove conjectures,$$ Using the
heuristics discussed by Polya (e.g., analogy, inductive inference from empirical
evidence, planning in an abstract space, double induction) *
use intuition$$ Intuition
will mean 
simulated real-world scenarios. These are predominantly visual analogies
of current system problems. From them, the system can extract "answers"
which, while not justifiable, are quite often correct. *
to justify beliefs and to
understand new observations, evaluate concepts and theories for aesthetic$$
Aesthetics has components of harmony, unity, elegance, simplicity, closure, etc.*
interestingness$$ Interestingness will be evaluated locally; e.g.,
the concept named "COMPOSITION" knows several features which indicate when any given
composition is interesting; one of these is "if the domain and range of the final
composition are equal sets,
but such domain/range equality does not hold true
for either of the two constituent relations of
the composition".*,
and use such judgments to guide development in the most promising direction.
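
Here is a rough sketch, in Python notation, of the local interestingness
feature quoted in the footnote above; every name in it is illustrative,
not AM's actual internal machinery.

.B

# Sketch (hypothetical names): a composition F = G o H is judged
# interesting if F's domain and range coincide while neither G's
# nor H's do.  H is applied first, so F maps domain(H) -> range(G).

def composition_is_interesting(g_domain, g_range, h_domain, h_range):
    f_domain, f_range = h_domain, g_range
    return (f_domain == f_range and
            g_domain != g_range and
            h_domain != h_range)

# A SETS->PAIRS relation composed with a PAIRS->SETS relation yields
# a SETS->SETS composition, which this feature flags as interesting.
print(composition_is_interesting('PAIRS', 'SETS', 'SETS', 'PAIRS'))   # True
.E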

More specifically, AM will be given 150 particular ⊗4premathematical⊗*$$
Universal, fundamental, prenumerical knowledge. This includes:
(i) mathematics: elementary notions of sets, relations, properties, 
(ii) methods: problems, problem-solving,
proof techniques, analogy, programming, strategies, communication,
(iii) opaque abilities: judgmental criteria
to evaluate interestingness, aesthetics, utility, etc; locate relevant knowledge,
manipulate abstract simulated "intuition" scenarios. *
concepts,$$
What do we mean by a "mathematical concept"?  A few moments of pondering this
should convince the reader that this cannot be answered cleanly.
Later, we shall indicate the classes of concepts involved; for now, let us just
mention a few specific ones to indicate the breadth involved. Each of the
following is a mathematical concept:
Set, Relation, α{1, frob, α{α}α}, Zorn's Lemma, Theory, Union, Prove, Proof, Theorem,
The Unique Factorization Theorem (UFT), The proof of UFT, The methods used to
prove UFT, Constructively proving existence, Associativity.
A circular definition of a mathematical concept might be
"all the things discussed in math books";
an operational definition might be "whatever AM knows about initially, plus
whatever new knowledge it acquires." * 
and will 
then study their facets, occasionally deciding that some facet is worth
naming and storing as a new concept. One of the concepts not fed to AM, but which
AM will hopefully develop soon, is
the notion of Cardinality, which paves the way to the domain
of (elementary) number theory.
Eventually, AM may develop enough mathematical systems to
notice a common structure, which will be abstracted into something like the
group axioms. This might then lead into developing elementary abstract algebra.

AM is almost completely specified on paper$$
Each of the 150 concepts to be initially supplied has about one page of
information; all of this is found in the file GIVEN[AM,AJC] at SAIL.*.
The control structure for the system is programmed, and a few concepts have
already been introduced. Some nontrivial behavior should appear when a few
dozen more are added. In a few  months, when all 150 are present,
AM will be  debugged and one may start experimenting
with it.

Such a proposal immediately raises many questions; they are listed below, and
each one is then dealt with in a separate section.

.BEGIN NOFILL INDENT 3 PREFACE 0 SELECT 2 GROUP; ONCE PREFACE 2

Why choose mathematics research as the domain for investigating theory formation?
What are some potential concrete applications of this project?
How will the success of the system be measured?
What experiments can be done on AM?
What is our model of math research? ⊗7To automate something, you must have a good model for it.⊗*
What given knowledge will AM initially start with?
How will this knowledge be represented?
What is the control mechanism; what does AM "do"?
Examples of individual knowledge modules. ⊗7One module for each concept.⊗*
Examples of communities of modules interacting, developing new modules.
What is the timetable for this project?   What is the current state?
.END

.EVERY HEADING(⊗7Automated Math Theory Formation,Doug Lenat,{DATE}     page  {PAGE}⊗*)

.NSECP(Why Choose Math Research?)

As the DENDRAL project illustrated so clearly$$ see Buchanan,
Feigenbaum, et al.  Choice of subject was enabled by J. Lederberg in 1965.*,
choice of subject domain is
quite important when studying how researchers discover and develop their theories.
Mathematics has been chosen as the domain of this investigation, because

(i) In doing math research, one needn't cope with
the uncertainties and fallibility of testing equipment;
that is, there are no uncertainties in the data (compared to, e.g., chemical
identification from mass spectrograms).

(ii) Reliance on experts' introspections is one of the most powerful techniques
for codifying the judgmental criteria necessary to do effective work in a field;
I am enough  of an expert in 
elementary mathematics so that I
won't have to rely on external
sources for guidance in formulating such heuristic rules.

(iii) A hard science is of course easier to work in than a soft one; to
automate research in psychology would require more knowledge about 
human information processing than now is known, because psychology deals with
entities as complex as you and I. Also, in a hard science, the ⊗4languages⊗* to
communicate information can be simple even though the 
⊗4messages⊗* themselves are sophisticated.

(iv) 
Since mathematics can deal with any conceivable constructs, a researcher there is
not
limited to  explaining observed data.
This reason is much less influential in practice than would be expected:
It turns out that most unmotivated forays are fruitless.
Most of the successful, interesting mathematical theories are inspired by reality$$
Or by other math theories which are well-established already. It is too philosophical
an issue to argue whether or not such theories constitute part of reality, so I
merely note them as a possible exception.*.
.NSECP(Potential Applications)

(i) The modular representation of knowledge that AM uses might prove to be effective
in other large knowledge-based systems.

(ii) Most of the particular heuristic judgmental criteria for interestingness,
utility, etc., might be valid in developing theories in other sciences.

(iii) If the repertoire of intuition (simulated real-world scenarios)
is sufficient for AM to develop elementary concepts of math, then educators
should ensure that children (4-6 years old) are thoroughly exposed to those
scenarios.$$ These tentatively include situations like seesaws, slides,
piling marbles into pans of a balance scale, comparing the heights of towers
built out of cubical blocks, solving a jigsaw puzzle, etc.*

(iv) We might learn something about how the theory formation task changes
as the theory grows in sophistication. For example, can the same methods which
lead AM from premathematical concepts to arithmetic also lead AM from
there to abstract algebra? Or is a new set of intuitions or criteria required?

(v) An unanticipated but possible result would be the proofs of -- or at least the
statements of -- interesting new theorems or even whole theories.
This might also take the form of a redivision of existing concepts, an alternate
formulation of some established theory, etc.
.NSECP(Measuring Success)

(i) By AM's achievements: Compare the list of concepts and methods it develops
against the list of concepts and methods it began with.
Did AM ever discover anything interesting yet unknown to the user?$$ 
The "user" is a human works with AM interactively, giving it hints, commands,
questions, etc.
Notice that by "new" we mean new to the user, not new to Mankind. 
This might occur if the user were a child, and AM discovered
some elementary facts of arithmetic.
This is not really
so provincial:  mathematicians take "new" to mean new to Mankind, not
new in the Universe.  I feel philosophy slipping in, so this footnote is
terminated.*

(ii) By the route AM took to accomplish these advances:  How clever, how circuitous,
how many of the detours were quickly identified as such and abandoned?

(iii) By the character of the User--System interactions: How important is the user's
guidance? How closely must he guide AM? What happens if he doesn't say anything ever?
When he does want to say something, is there an easy way to express that to AM,
and does AM respond well to it?
Given a reasonable kick in the right direction, can AM develop the mini-theories
which the user intended, or at least something equally interesting?

(iv) By its intuitive powers: when forced to make a snap judgment on some issue,
which side does AM choose, and why?  Are the conjectures it tries to prove usually
true?$$ In fact, I hope AM tries to prove an intuitively clear yet false
proposition.*  How accurately does AM estimate the difficulty of tasks it
is considering?  How big a help is the intuitive belief in a conjecture
when trying to formulate a constructive proof of that conjecture?
Does AM tie together (e.g., as analogous) concepts which are formally unrelated
yet which benefit from such a tie?

(v) By the ability to perform the experiments outlined in the next section.
Regardless of the experiments' outcomes, 
the features of AM which allow them to be carried
out at all would be interesting in themselves.
.NSECP(Experimenting with AM)

Assume that AM has been written, and has in fact developed some new$$
New to AM, not to the world. It would be quite accidental if AM proved 
theorems unknown to Mankind.*
concepts on its own. 
Here is a list of some of the more interesting experiments that would be
possible to perform, using AM:

(i) Remove individual concept modules. Is performance noticeably degraded?$$
AM should operate even if most of its criteria have been
"turned off"
and most of its modules excised,
so long as ↓_any_↓ parts of any of the modules remain
enabled. 
If the remaining fragment of AM is too small, however, it may not be
able to find anything interesting to do. In fact, this situation has
been encountered experimentally, when the first few partially complete
modules were inserted. If only some abilities and criteria are turned off,
AM may in fact keep operating without this "uninteresting collapse".
For example, if all but
the formal manipulation knowledge is removed, the
system should still grind out
(simple) proofs. If all but the analogy and intuition criteria are excised,
some plausible (but uncertain) conjectures should still be produced and
built upon. 
If these forces are buried deep in an Environment, they should be tunable
(by the creators) to almost negligibility, so the same experiments can still be
carried out.
The converse situation should also hold: although still functional with any module
unplugged, the performance ↓_should_↓ be noticeably degraded. 
That is, while not indispensable, each module should nontrivially help the others.
For example,
the job of proving an assertion
should be made much easier by the presence of intuitive understanding. If
a constructive proof is available, the necessary materials will already be
sketched out for the formal methods to build upon. *
Which concepts does AM now "miss" discovering? Is the removed concept
later discovered anyway by those which are left in AM?  This should indicate
the importance of each kind of concept (and method) which AM starts with.

(ii) Vary the relative weights given to features by the criteria which judge
aesthetics, interestingness, worth, utility, etc.  See how important each factor
is in directing AM along successful routes.

(iii) Add several new concept modules (and a few new intuitions) and see if AM
can work in some unanticipated field of mathematics (like 
graph theory or calculus).

(iv) Add several new intuitions, and see if AM can develop nonmathematical
theories (elementary physics, program verification). This would also require
limiting AM's freedom to "ignore a given body of data and move on to something
more interesting".

.NSECP(Model of Math Research)

In order to intelligently discuss how to automate an activity, we must be very
clear about how humans do that activity. Thus, for AM, we must begin by hypothesizing
a particular model of how mathematicians actually go about doing their research.
After presenting our model, we can then discuss how to automate the processes
involved.
Thanks to Polya, Kershner, Hadamard, Skemp, many others, and introspection,
I have pieced together a tentative such information processing
model for math theory formation:

.BEGIN INDENT 0,3,0 
.ONCE SELECT 8 TURN OFF "α" TURN ON "∞→"
⊂∞α→⊃
.NARROW 2,2  SINGLE SPACE PREFACE 0

1. The order of events in a typical mathematical investigation is:
(a) OBSERVE:
The observation is either of reality or of an analogous, already-established
mathematical theory.
(b) NOTICE REGULARITY: Perceive some patterns, some interesting relationships
that appear to hold sometimes.
Thus math research is an ⊗4empirical⊗* process.
(c) FORMALIZE: Decide on some of the objects, operators,
definitions, and statements that the theory will contain.
(d) FINALIZE: Decide which concepts are to be primitive and which aren't;
decide which statements will be considered axioms, and ensure that the others
can in fact be derived from them.
(e) DEVELOP: What additional theorems can be proven from this formal system;
do they correspond to observable phenomena in the domain which motivated
this new theory?  When new observations are made in that motivating domain,
can they be naturally phrased as formal statements in this theory; and if so,
are they provable from the existing axioms and theorems?


2. 
Notice that each step in (1) involves choosing from a large set of alternatives
-- that is, searching.
The key to keeping this from becoming a blind, explosive search is the
proper use of evaluation criteria. That is, one must constantly choose the
most interesting, aesthetically pleasing, useful,... alternative available.
This is analogous to Dendral's reliance on good heuristics to constrain the
structure generator.

3. 
But many of those criteria are usually opposed to each other 
(e.g., one often must sacrifice
elegance for utility, interestingness for safety, etc.). How should one
weight these features when deciding what to do next  during an investigation?
We believe (and one goal of AM is to test) that
the non-formal criteria (aesthetics, interestingness, inductive inference (from
empirical evidence), analogy, intuitive clarity, utility)
are much more important than formal
deductive methods in developing mathematically worthwhile theories, and
in avoiding barren diversions.
Among the subjective criteria, the  order listed above is roughly their order
of importance. However, AM should have a dynamically variable "orientation",
which under certain circumstances might induce it to seek safety (e.g., utility)
rather than uncertainty (e.g., proposing an analogous proposition).

4. The above criteria are virtually the same in all domains of mathematics,
and at all levels. Each factor encourages some pursuits and discourages others.
It is hoped that no modifications need be made to AM's judgmental scheme,
as AM acquires more and more new concepts.

5. For true understanding, AM should grasp$$  Have access to, relate to, store,
be able to manipulate, be able to answer questions about *
each concept in several ways:
declarative (definition), operational (how to use it), demonic (recognizing
when it is relevant), exemplary (especially boundary examples), and
intuitive (simulated image of a real-world interpretation).

6. Progress in ⊗4any⊗* field of mathematics demands much intuition (and some
formal knowledge) of ⊗4many⊗* different "nearby" 
mathematical fields. So a broad,
universal core of intuition must be established before any single theory
can meaningfully be developed.
Intuition is contrasted with more formal representations by the fact that it
is opaque (AM cannot introspect to determine how the result is produced) and
fallible.

.WIDEN 2,2
.ONCE SELECT 8 TURN OFF "α" TURN ON "∞→"
%∞α→$
.END

.SSEC(Details)

This last point (6) merits elaboration. I believe that to develop any given field
of mathematics, one must possess much intuition about each
⊗4psychologically⊗* prerequisite field,$$ One which makes the new field
easier to learn, which contains more concrete analogues of the ideas of the new
field. For example, knowing about geometry makes it easier to learn about
topology, even though topology never formally uses any results or definitions
from geometry. * and some definite facts  about each ⊗4formally⊗* preceding 
field$$ For example, arithmetic is usually formally based upon set theory,
although most of us learn about them in the other order *  of mathematics.  
The diagram here
indicates the prerequisites for each domain which might
conceivably be worked in by AM.   To start in a given domain, some knowledge
should exist about all domains which point to the given one,
about all domains which point to ⊗4them⊗*, etc.

.GROUP SKIP 3

⊗4NOTE: This section, and especially the one after it, are
supplementary, and may be skipped on first reading.⊗*

.B0


Elementary Logic  ααα→  Theorem-Proving  ααααααααααααααα⊃
    ↑							~
    ~							~
    ~							~
    εαααααααα→  Geometry  ααα→  Topology		~
    ~		    ~		↑      ~		~
    ~		    ~		~      ~		~
    ~		    ↓	        ~      ↓ 		~
    ~      Analytic Geometry    ~   Algebraic Topology	~
    ~		 ↑              ↓      ↑		~
    ~            ~ Measure Theory      ~		~
    ↓	         ~ ↑ 		       ~		~
Boolean Algebra αβαβαα→  Abstract Algebra 		~
    ↑            ~ ~      ~				↓
.ONCE TURN ON "α"
    ~	         ~ ↓      ~		  Program Verification
    ~	    Analysis      ↓				↑
    ~		   ↑     Concrete Algebra		~
    ~              ~      ↑				~
    ~		   ~      ~				~
    ↓		   ~      ~				~
Set Theory  ααα→  Arithmetic  ααα→  Number Theory	~
		      ~					~
		      ~					~
		      ↓					~
		Combinatorics  ←ααα→  Graph Theory  αααα$
.E


Each arrow represents either a psychological or a formal prerequisite:
To work in the field pointed ↓_to_↓, one should know something about the field
pointed ↓_from_↓.
Notice that almost all the "elementary" 
branches of mathematics are either formally
or psychologically prerequisite 
to each other.$$ For example, one should have some intuition about 
sets before doing Number theory,
and one should have some intuition about numbers 
and counting before doing Set theory. * So a broad foundation of intuition, 
spanning ⊗4↓_several_↓⊗* mathematical and 
real-world concepts, is prerequisite to sophisticated behavior in ⊗4any⊗*
branch of mathematics.
This smacks of  the idea of a "critical mass" of knowledge,
so often sensationalized in science fiction. In addition to 
expecting that the corpus must exceed some minimum absolute size,
I am claiming that it must also exceed some minimum degree of richness, of breadth.

.SSEC(Textbook-Order vs Research-Order)

Let us elaborate on point (1) of the model. The idea there is that
the order in which a math textbook presents a theory is almost the exact
opposite of the order in which it was actually discovered and developed.

.B0  INDENT 10

PRIMITIVES, AXIOMS ααααα→ DEFINITIONS    ←αααααααααααααα⊃
				~			~
				~			~
				↓			~
			STATEMENT OF LEMMA 		~
				~			~
				~			~
				↓			~
			  PROOF OF LEMMA		~
				~			~
				~			~
				↓			~
				~			~
				~			~
				↓			~
		     STATEMENT OF NEXT LEMMA		~
⊂αααααααα⊃ 			~			~
~textbook~      		|			~
~ order  ~			|			~
%αααααααα$			|			~
				|			~
				|			~
				↓			~
		       PROOF OF LAST LEMMA		~
				~			~
				~			~
				↓			~
		      STATEMENT OF THEOREM		~
				~			~
				~			~
				↓			~
			PROOF OF THEOREM ααααααααααααααα$

.E

In a textbook, one introduces some primitive elements of the theory, postulates
some axioms relating these objects and operations, then enters a 
⊗4"Define → State → Prove⊗*" loop. Lemmas are stated and proved before the
theorems which require them. Motivations of any kind are a rarity, and often
are mentioned parenthetically as an application of a theorem which has just been
proved.  
As an example, here is how my sophomore textbook was organized:

.B0
.INDENT 10
.TURN ON "α"

Set N, Op S:N→N, 
Peano⊗1'⊗*s Axioms   αααααααααααα→   Define +                      
				~			
				↓			
		    State some property of +   		
				~			
				↓			
		      Prove this property		
				|			
				|		
				|			
		           Define ⊗6≤⊗*
				|			
				|			
				|			
		           Define g.l.b
				~			
				~			
				↓			
		   State that glb always exists	
				~			
				↓			
			Prove this claim
				|
				|
				|
			  Define  TIMES
			  Define ⊗5Z⊗*
				|
				|
				|
			  Define  DIVIDES
			  Define  PRIME
				~
				↓
		     State Euclid⊗1'⊗*s Algorithm
				~
				↓
		     Prove Euclid⊗1'⊗*s Algorithm
.E

It began by presenting Peano's axioms, completely unmotivated, then entered
the Define → State → Prove loop.  For example, the two concepts of
⊗4prime⊗* and ⊗4non-factorizable number⊗* are defined separately,
even though a hundred pages elapse before any mathematical system is presented
in which the two notions don't completely coincide.

This psychic ordering is repeated, 
in microcosm, in each textbook proof. For example,
a typical epsilon/delta argument will start by magically proposing some function
delta(epsilon), which is just the right one for the approximations taken later.

In contrast, 
a mathematician doing research works in almost the opposite order.

.B0 INDENT 0

OBSERVE x  ααααα→  NOTICE SOME INTERESTING REGULARITIES  
				 ~
				 ~
				 ↓
	     GATHER UP ALL RELATED INTERESTING RELATIONSHIPS
				 ~
				 ~					⊂αααααααα⊃
				 ↓					~research~
		        STATE THESE FORMALLY				~ order  ~
				 ~					%αααααααα$
				 ~
				 ↓
		CHOOSE THE MOST INTUITIVELY OBVIOUS OF THESE
				 ~
				 ~
				 ↓
		 TRY TO DERIVE ALL THE REST FROM THIS CORE
				 ~
				 ~
				 ↓
       IF THE CORE IS INCONSISTENT, ELIMINATE THE LEAST BELIEVED MEMBERS
       IF ANY CAN⊗6'⊗*T BE DERIVED: ENLARGE THE CORE SO THAT THEY CAN BE
       IF ANY MEMBER OF THE CORE CAN BE DERIVED FROM THE OTHERS, ELIMINATE IT
				 ~
				 ~
				 ↓
	   CALL THE CORE "AXIOMS", AND THE REST "THEOREMS"
				 ~
				 ~
				 ↓
       TRY TO DERIVE PREVIOUSLY UNSUSPECTED THEOREMS, FORMALLY
		IF SO: TRY TO REINTERPRET IN TERMS OF x
       TRY TO FIND NEW RELATIONSHIPS OCCURRING IN x
		IF SO: TRY TO PROVE THEM USING THE EXISTING AXIOMS AND THEOREMS
.E

The researcher begins by observing the world: he looks at scenarios from reality
and perhaps also some earlier, already-developed, interesting mathematical theories.
He notices some interesting facts, either by raw perception of regularities
or by analogy to some interesting pattern in another theory. Of these observations,
he chooses a small set of the most intuitively clear ones, and tries to derive the
other observations formally from this chosen core. After several passes, he will
succeed; the final chosen core he calls axioms, and the rest he calls theorems.
$$ Once axiomatized, new theorems may be proposed syntactically: the researcher then
may check each conjecture against the roots of the theory (both real-world and
analogous). Additional observations can later be checked against the theory,
to see if they also can be derived from the axioms.*
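
The loop in the middle of the diagram above -- shrinking and enlarging the
core until it is consistent, sufficient, and non-redundant -- is mechanical
enough to sketch in Python. Here derives, consistent, clarity, and belief
stand in for expensive proof and judgment machinery; the sketch is not a
claim about AM's actual algorithm.

.B

def axiomatize(observations, clarity, derives, consistent, belief):
    # begin with the few intuitively clearest observations
    core = sorted(observations, key=clarity, reverse=True)[:3]
    while True:
        rest = [s for s in observations if s not in core]
        if not consistent(core):
            core.remove(min(core, key=belief))      # drop least-believed member
        elif any(not derives(core, s) for s in rest):
            core.append(next(s for s in rest        # enlarge the core so the
                             if not derives(core, s)))   # stragglers follow
        else:
            redundant = [a for a in core
                         if derives([c for c in core if c != a], a)]
            if not redundant:
                return core, rest                   # "axioms" and "theorems"
            core.remove(redundant[0])               # a derivable axiom goes
.E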

Let's make this more concrete, by considering how a mathematician might discover
and develop the simplest numerical concepts, the ones my textbook introduced
and developed in such a polished way.

Assume that he possesses our previously-mentioned prenumerical background, but
nothing more.
He begins by considering the two activities:
Comparing piles of marbles using a pan balance, and comparing the heights of
towers constructed from toy blocks.

.SKIP TO COLUMN 1
.B0 INDENT 7

	       /~\				       /~\
	      /	~ \				      /	~ \
	     /	~  \				     /	~  \
	    /	~   \				    /	~   \
	   /    ~    \				   /    ~    \
	  /	~     \				  /	~     \
      	 /	~      \    		     oo	 /	~      \  oo
    αααα/	~	\αααα		    αααα/	~	\αααα
		~					~
		~					~
		~					~
    αααααααααααα∀αααααααααααα		    αααααααααααα∀αααααααααααα

			       /~\
			      /	~ \
			     /	~  \
		 	    /	~   \   o
			   /    ~    \αααα
			  /	~     			⊂αααααααα⊃
		      	 /	~      			~weighing~
		        /	~			~ marbles~
		       /	~			%αααααααα$
		 ooo  /		~
		 αααα/		~
				~
		    ααααααααααααααααααααααααα

*********************************************************************************

	  αααααα⊃
	        |
	⊂ααααα⊃ |
	~     ~ |
	~     ~ |					⊂αααααα⊃
	εαααααλ |					~piling~
	~     ~ |		  αααααα⊃		~blocks~
	~     ~ |		  	|		%αααααα$
	εαααααλ |		⊂ααααα⊃ |
	~     ~ |		~     ~ |
	~     ~  		~     ~ 
	%ααααα$     		%ααααα$
.E
.SKIP TO COLUMN 1

When he tires of playing, the mathematician makes a list of the most basic
features of these situations. 


.BEGIN NOFILL SELECT 1 PREFACE 0 INDENT 0 TABS 44,85 TURN ON "\" GROUP
.TURN ON "→"
.ONCE CENTER PREFACE 2
⊗5FUNDAMENTAL CONCEPTS⊗*

ABOUT WEIGHING MARBLES:\ABOUT PILING BLOCKS: →COMMON NAME:

There are some marbles to play with.\There are some blocks to play with.→ele

The marbles are indistinguishable.\The blocks are indistinguishable.→set

Besides individual marbles, we can\Besides individual blocks, we can→var n
talk about a pile of marbles as a unit.\talk about a tower of blocks as a unit.

We can copy a given pile.\We can copy a given tower.→copy C

A pile of marbles and a copy of\A tower of blocks and a copy of
that pile are indistinguishable.\that tower are indistinguishable.→ignore C

To make pile heavier, add a marble.\We can stack another block onto a tower.→succ S

Lighten pile by removing a marble.\Can shorten tower by removing top block.→pred P

We can tell if piles x,y balance.\Can see if towers x,y are same size.→equal =

Can tell if pile x is heavier than y.\We can see if tower x is higher than y.→⊗6>⊗*

We can add one pile to another.\We can stack one tower on top of another.→add +

Can remove a pile from a big one.\Can lift a tower off top of a big one.→sub -

Can remove any marble from pile.\We can only remove topmost blocks.→pop

Pile might have no marbles.\There might be no blocks in a tower.→zero 0

Can replace each marble in a pile\We can replace each block in a tower
by a copy of some other pile.\by a copy of some other tower.→mult x

A lone marble is also a pile.\A lone block is also a tower.→one 1

etc.\etc.→etc.

***********************************************************************************
.E

These correspond to the objects and operators of the theory he is about to develop.
He gives them each a name; I have permitted him to choose the common name for
each concept, but that doesn't matter, of course. Notice that most of these
are not usually considered primitive at all. 

The next list he draws up contains several specific observations, translated from
the blocks/marbles manipulation language into these new symbols.
Each relationship is translated from these symbols into the
⊗4other⊗* scenario's language, and checked. Thus all the observations here are
meaningful in both of the manipulative domains:

.BEGIN NOFILL SELECT 6 PREFACE 0 INDENT 0 TABS 15,30,45,60,75 TURN ON "\" GROUP

************************************************************************************************

0=0\1=1\¬(0=1)\¬(1=0)\¬(S(1)=1)\¬(0=S(0))

S(0)=1\1=S(0)\S(0)=S(0)\S(1)=S(1)\¬(S(0)=S(1))\¬(S(1)=S(0))

0+0=0\1+1=S(1)\0+1=1\1+0=1\S(1)+0=S(1)\S(0)+S(0)=S(1)

0x0=0\1x1=1\0x1=0\1x0=0\S(S(1))x0=0\S(S(1))x1=S(S(1))

1>0\S(1)>1\S(1)>0\¬(0>1)\¬(1>1)\¬(0>0)

************************************************************************************************

.E


Now comes some simple generalization, either directly from the scenarios,
perceptually from the list just compiled, or syntactically from that list
(e.g., by replacing constants by variables). 
Again,  each of these generalizations
is intuitively checked in both real-world domains.

.BEGIN NOFILL SELECT 6 PREFACE 0 INDENT 0 TABS 15,30,45,60,75 TURN ON "\" GROUP

*************************************************************************************************

If n>0, then  D(n) can be performed; else D(n) cannot be performed.

If n>0, then n>D(n)\S(n)>n\\S(n)=n+1\S(S(n))>n

If n=m then S(n)=S(m)\\\If S(n)=S(m) then n=m

D(S(n))=n\\If n>0 then S(D(n))=n\If n>0 then n+m>m

m+S(n)>m\\If n=m then ¬(n>m)\If n>m then ¬(n=m)

If n=m ∧ n>r then m>r\n=n\¬(n>n)\¬(n=S(n))

If n>m ∧ m>r then n>r\If n=m ∧ m=r then n=r\If n=m ∧ r=m then r=n

If n=m then m=n\If n>m then ¬(m>n)\n>m ∨ m>n ∨ n=m

nxm = mxn\\nxS(m)=nxm + n\\n+m=m+n

n+S(m)=S(n)+m\\(n+m)+(r+s) = n+((s+m)+r)

nx(mxr)=(nxr)xm\S(n)+D(m)=m+n

etc.

*************************************************************************************************

.E
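
The syntactic route mentioned above -- replacing constants by variables --
is simple enough to sketch in Python. The tuple representation of
statements below is an assumption of the sketch, not AM's format.

.B

CONSTANTS = {'0', '1'}

def substitute(stmt, old, new):
    """Replace every occurrence of one symbol throughout a statement."""
    if stmt == old:
        return new
    if isinstance(stmt, tuple):
        return tuple(substitute(part, old, new) for part in stmt)
    return stmt

def generalize(stmt, var='n'):
    """Propose candidate generalizations of one specific observation."""
    for c in sorted(CONSTANTS):
        g = substitute(stmt, c, var)
        if g != stmt:                  # keep only genuine generalizations
            yield g

# From the specific observation S(0)=S(0), propose S(n)=S(n), which
# must then be checked intuitively in both manipulative domains.
for g in generalize(('=', ('S', '0'), ('S', '0'))):
    print(g)
.E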

Some of these are now considered as axiomatic, as defining the operations and
structures involved; the rest are considered secondary, as theorems following from
the axioms. This process is a slow search.
The most important game is not "finding minimal axiom sets", but rather
"what else can we prove from this set of axioms and theorems".
The AM research goal is to learn how to play this game of theory development.

To summarize idea (1):
The ⊗4motivation⊗* for a new mathematical development is either:
(a) looking at reality and at the mathematics
you already have$$ Then abstract out some interesting features; 
formalize those into a
specific set of primitive structures and relations  and axioms about those
basic entities *, or (b) given an existing theory, propose some new conjecture or
definition, based on analogy to the real-world roots of this theory,$$ Based
on patterns perceived in empirical
evidence *  or based on analogy to a known interesting theorem in another theory,
or occasionally based only on applying some fruitful method and observing the
result.

.SSEC(Metaphors to other processes)

Let's consider some alternate ways of looking at such theory development, some
analogies to processes with which you're probably more familiar.


.SSSEC(Growing a tree -- using heuristic constraints)

Once motivated, the idea is developed and evaluated. At each moment, the
researcher should be working on the most promising of all the proposed ideas.
This process is thus a sophisticated expansion of a tree,
where new conceptual nodes are
"grown" in the most promising area, and where barren dead-ends are eventually
"pruned". To do mathematical research well, it is thus necessary and sufficent
to have good methods for proposing new concepts from existing ones, and for
deciding how interesting each candidate is. That is: effective growing and pruning
strategies.

.SSSEC(Exploring a space using an evaluation function)

Consider our core of premathematical knowledge.
By compounding the known concepts and methods,$$Using 
abstraction from reality, analogy with existing theories,
the postulational method, and problem-solving techniques *
we can extend this
foundation a little wherever we wish.
⊗4<Visualize: SEVERAL SHORT LINES IN BLACK, EMANATING
FROM A CENTRAL CORE>⊗*.
The incredible variety of alternatives to investigate includes
all known mathematics, much trivia, countless deadends, and so on.
The only "successful" paths near the core 
are the narrow ribbons of known mathematics 
(perhaps with a few undiscovered other slivers). 
⊗4<Visualize SNAKE-LIKE LINES IN RED, twisting away from the core, intersecting>⊗*.

How should we walk through this immense space, with any hope of following the
few, slender branches of already-established mathematics (or some equally successful
new fields)? We must do hill-climbing; as new concepts are formed, decide how 
promising they are, always explore the currently most-promising new concept. The
evaluation function is quite nontrivial, and this research may be viewed as an
attempt to study and explain and duplicate the judgmental criteria people employ.
My attempts at codifying such "mysterious" emotive forces as intuition, aesthetics,
utility, richness, interestingness, relevance... indicate that a large but not
unmanageable collection of heuristic rules should suffice.
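
A minimal sketch of this hill-climbing, in Python: grow new concepts near
the most promising existing one, and prune whatever scores below a cutoff.
The proposal mechanism and the worth function -- where all the judgmental
criteria would actually live -- are placeholders.

.B

import heapq, itertools

def explore(core_concepts, propose, worth, steps=100, cutoff=0.1):
    tick = itertools.count()           # tie-breaker, so concepts never compare
    agenda = [(-worth(c), next(tick), c) for c in core_concepts]
    heapq.heapify(agenda)
    for _ in range(steps):
        if not agenda:
            break
        neg_worth, _, concept = heapq.heappop(agenda)
        if -neg_worth < cutoff:
            continue                   # prune: judged barren, never grown
        for new in propose(concept):   # grow near the current best node
            heapq.heappush(agenda, (-worth(new), next(tick), new))
.E

The skeleton itself is trivial; everything of interest hides inside worth.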

The important visualization to make 
is that with proper evaluation criteria, we convert the
flat picture  to
a breath-taking relief map:
the known lines of development become
mountain ranges, soaring above the vast flat plains 
of trivia and inconsistency
below.
Occasionally an isolated
hill is discovered near the core;$$ E.g.,
Knuth's ↓_Surreal Numbers_↓ * certainly
whole ranges lie undiscovered for long periods of time:$$ E.g.,
non-Euclidean geometries * the terrain far from the initial core is
not at all explored.  

Intuition is like vision, letting the explorer observe a distant mountain,
long before he has conquered its intricate, specific challenges.

If the criteria for evaluating interestingness and promise are good enough, then
it should be straightforward to simply push off in any direction, locate some
nearby peak, and follow the mountain range along (duplicating the development in
some field). In fact, by intentionally pushing off in apparently barren directions,
new ranges might be encountered. If the criteria are "correct", then ⊗4any⊗* new
discovery the system makes and likes will necessarily be interesting to humans.

If, as is more likely, the criteria are deficient, this too will tell us much. Before
beginning, we shall strive to include all the obvious factors which enter into
judgmental decisions, with appropriate weights, etc. If the criteria fail, then
we can analyze that failure and learn about one nonobvious factor in evaluating
success in mathematics (and in any creative endeavor). After modifying the criteria
to include this new factor, we can proceed again.

.SSSEC(Syntax and Semantics in Natural Language Processing)

One final analogy: Consider assumptions, axioms, definitions, and
theorems to be 
syntactic rules for the
language that we call Mathematics. 
Thus theorem-proving, and the whole of textbook mathematics, is a purely syntactic
process.
Then these judgmental
criteria I am alluding to would correspond to the semantic knowledge associated
with these more formal methods.
Just as one can upgrade natural-language-understanding by incorporating semantic
knowledge, so AM ensures that it knows the intuitive and empirical roots of
each concept in its repertoire.

.NSECP(Knowledge AM Starts With)

At each stage, AM has some set of concepts (represented as modules,
clumps of information). AM's basic activity is to fill out the existing clumps
more, and develop new clumps. Each clump is called a BEING. Here are
the names of the BEINGs (concepts, clumps, ... ) which AM will start with:


.BEGIN FILL SELECT 6 SINGLE SPACE PREFACE 0 TURN OFF "{}-"
.INDENT 3,7,0

.ONCE PREFACE 3 INDENT 0
(i) ⊗2Objects⊗*

Ordered-pair,
Variable,
Propositional-constant,
Structure,
List-structure,
Bag-structure,
Set-structure,
Assertion,
Axiom

.ONCE PREFACE 2 INDENT 0
(ii) ⊗2Actives⊗*


Operations:  Compose, Insert, Delete, Convert-structure, Substitute, Assign,
Map-structure, 
Undo,
Reverse-ordered-pair, Rule-of-inference, Disjoin, Conjoin, Negate,
Imply, Unite, Set-union, Cross-product, Common-parts, Set-intersection,
Set-difference, Put-in-order, Abstract-out-similarities, Contrast

Relations: Equality, Membership, Containment, Equivalence, 
Equipollence,
Scope, 
Quantification, ⊗7For-all, There-exists, Not-for-all, Never, Only-sometimes⊗*

Properties: Ordered, Extreme, Properties-of-Activities

.ONCE PREFACE 2 INDENT 0
(iii) ⊗2Higher-order Actives⊗*

Find, Select, Guess, Ellipsis, Analogize, Conserve, Approximate

Examine, Test, Assume, Judge,
Define, 
Prove, ⊗7Logically-deduce, Prove-directly, Cases,
Working-backwards, Prove-indirectly, Prove-universal-claims, Mathematical-induction,
Prove-existential-claims, 
Prove-existence-constructively, Prove-existence-deductively,⊗*
Disprove, ⊗7Disprove-constructively, Disprove-indirectly,⊗*

Solve-problem, Debug, Trial-and-error, Hill-climb, Subgoal, Work-indirectly,
Relations-between-problems

Communicate, Communicate-with-user, Translate-into-English-for-User,
Translate-from-English-for-concepts, User-model, Communicate-among-concepts,
Infer-from-examples, Select-representation

Isomorphism, Categoricity, Canonical, Interpretation-of-theory

.ONCE PREFACE 2 INDENT 0
(iv) ⊗2Higher-order Objects⊗*

Statement, Conjecture, ⊗7Statement-of-generality, Statement-of-existence,⊗*
Theorem, Proof, Counterexample, Contradiction, Analogy, Assumption

Problem, Problem-to-find, Problem-to-prove, Problem-to-creatively-define,
Action-problem, Bug, Inference-problem

Representation, Symbolic-representation, Diagram, Scenario

Mathematical-theory, Axiomatic-Foundation, Primitive-Basis, Formal-system

.END
.ONCE PREFACE 3

The more basic the initial core concepts, the more chance there is that the 
research will go off in directions different from humans, the more
chance it will be a waste of time, and the more valid the test of the search-pruning
forces.  Thus we won't even provide AM with the "simple" concepts of function,
inverse, commutativity, counting, etc. We do hope that AM discovers them, or
some equally interesting "low-level" building blocks.

Of course, we must ask ⊗4precisely⊗* what AM knows about each of these concepts.
But to fully answer that, we must digress and discuss how the knowledge
is represented. We must explain the modular BEINGs scheme for representing
information about a concept. This is treated curtly in the next section; a longer,
smoother introduction to BEINGs is found in Appendix 1.

.NSECP(AM's Representation of Knowledge)

AM will represent each concept in its repertoire
as a bundle of facets ⊗4(Definitions, Intuitions, Recognition, Usefulness,
Interestingness, Properties,...)⊗*, 
and each facet will be stored internally as a
little program.  Each concept will have precisely the same set of
25 facets.  This enables us to assemble, in advance, a body of knowledge
(called ⊗4strategic⊗* knowledge) about each facet. Each concept is called
a BEING$$
For historical reasons. BEINGs is a modular scheme for representing
knowledge, akin to ACTORs, 
PANDEMONIUM, SMALLTALK,
CLASSes, and FRAMEs. Details
about the origin of the BEINGs ideas
can be found in [Green et al], in [Lenat], or in Appendix 1 of this paper.*.
Depending
on how you want to visualize a BEING, its   subdivisions can be called
parts, questions, facets, or slots.
Part P of BEING B will be abbreviated as B.P.


.SSEC(Facets that each concept might have)

Each facet program can be viewed as answering a certain family of questions
about the BEING (concept) in which it is embedded$$ For example, the DEFINITION
facet of COMPOSE should be able to tell if any given entity is a composition.*.
Below is the tentative set of facets that concepts can have, followed by a
brief description of what question(s) each facet answers:


.SKIP TO COLUMN 1

.BEGIN FILL SELECT 7 PREFACE 0 SPACING 0 INDENT 15 GROUP

.ONCE FLUSH LEFT PREFACE 1
⊗4RECOGNITION GROUPING⊗*  Notice when this BEING, call it B, is relevant.

 CHANGES		Is this rele. to producing the desired change in the world?

 FINAL  		What situations is this BEING rele. to bringing about?

 PAST			Where is this used frequently, to advantage?

 IDEN {not}{quick}	{fast} tests to see if this BEING is {not} currently referred to


.ONCE FLUSH LEFT
⊗4ALTER-ONESELF GROUPING⊗*  Know about variations of yourself, how good you are, etc.

 GENERALIZATIONS	What is this a special case of? How to make this more general.

 SPECIALIZATIONS	Special cases of this? What new properties exist only there?

 BOUNDARY		What marks the limits of this concept? Why exactly there?

 DOMAIN/RANGE {not} Set of (what one can{'t} apply it to, what kind of thing one {never} gets)

 ORDERING(Complete)	What order should the parts be concentrated on (default)

 WORTH	Aesthetic, efficiency, complexity, ubiquity, certainty, analogic utility, survival basis

 INTEREST		What special factors make this type of BEING interesting?

 JUSTIFICATION   Why believe this? Formal/intu. For thms and conjecs. What has been tried?

 OPERATIONS  Properties associated with BEING. What can one do to it, what happens then?


.ONCE FLUSH LEFT
⊗4ACT-ON-ANOTHER GROUPING⊗*  Look at part of another BEING, and perhaps do something to it.

⊗1CHANGE⊗* subgrouping of parts:

 BOUNDARY-OPERATIONS {not}  Ops rele. to patching {messing}up not-bdy-entities {bdy-entities}

 FILLIN  How to initially fill it in, when and how to augment what is there already.

 STRUCTURE 		Whether, When, How to restructure (or split) this part.

 ALGORITHMS		How to compute this, do this activity. Related to Representation.

⊗1INTERPRET⊗* subgrouping of parts:

 CHECK   		How to examine and test out what is already there.

 REPRESENTATION  How should entities of type B be structured internally? Contents' format.

 VIEWS	(e.g., How to view any Active as an operator, function, relation, property, corres., set of tuples)
 


.ONCE FLUSH LEFT
⊗4INFO GROUPING⊗* General information about this BEING, B, and how it fits in.

 DEFINITION		Several alternative formal definitions of this concept. Can be axiomatic, recursive.

 INTU		Analogic interp., ties to simpler objects, to reality. Opaque.

 TIES   	Alterns. Parents/offspring. Analogies. Associated thms, conjecs, axioms, specific BEING's.

 EXAMPLES {not} {bdy}	Includes trivial, typical, and advanced cases of each type.

 CONTENTS       What is the value stored here, the actual contents of this entity.

.END

.FAD

.XGENLINES←XGENLINES+2

Let us take the organization sketched above as ⊗4literal⊗*; thus all knowledge in the
system is organized in packets called "BEINGs", 
with each BEING containing all the relevant facts and ideas about one single concept.
Each BEING has about 25 different slots or "parts",
with each part having a name and a value. One interpretation of this is that
the name represents a question, and the value represents how to answer that question.
The BEINGs are organized into five categories or "families", 
each containing about 30 BEINGs
to start with. The slots are organized into four categories, each with about
6 differently-named slots.
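
Taking the organization literally suggests a representation like the
Python sketch below. The class, the COMPOSE example, and its one-line
definition are all illustrative assumptions; in AM each part's value
is a little program.

.B

class BEING:
    def __init__(self, name):
        self.name = name
        self.facets = {}               # part name -> a little program

    def ask(self, part, *args):
        """B.P: put question P to BEING B, if that part is filled in yet."""
        program = self.facets.get(part)
        return program(*args) if program else None

compose = BEING('COMPOSE')
compose.facets['DEFINITION'] = (
    lambda e: callable(e) and getattr(e, 'is_composition', False))

# COMPOSE.DEFINITION applied to a candidate entity:
print(compose.ask('DEFINITION', len))      # False: len is no composition
.E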

There is one distinguished structure, called CS (for Current Situation), which
holds pointers to all the recent system activities, notably those which have been
suspended, relevant new information, goals outstanding, BEINGs which were recently
judged interesting but haven't been worked on since then, recent comments from the
user, etc.

.SSEC(Overall View of Representation in AM)

The two broad categories of knowledge are definite and intuitive. To represent
the former, we employ (i) rules and assertions, (ii) BEINGs grouped into families,
and (iii) opaque Environment functions. To represent the latter, we employ
(i) abstract rules, (ii) pictures and examples, and (iii) opaque Environment functions.


Each currently popular Artificial Intelligence formalism for representing knowledge
lies somewhere along (or very near to) the ideal
"intelligence vs. simplicity/speed" tradeoff curve.  Another way to
see this is to picture each representation as some tradeoff between
structure and uniformity, between declarative and procedural
formulations.  Each idea has its uses, and it would be unwise to
demand that any single representation be used everywhere in a given
system.  One problem with the alternative is how to interface the
multiple representations.  Usually this is enough to persuade
system-builders to make do with a single formalism.  The proposed
system will be pushing the limits of the available machinery in both
the time and space dimensions, and therefore cannot indulge in such
luxuries!  Knowledge used for different types of tasks must be
represented by the most suitable formalism for each task. 

BEINGs are highly structured, intelligent, but slow. Rules and
assertions are more uniform and swift, but frequently awkward. 
Compiled functions win on speed but lose on accessibility of the
knowledge they contain.  Pictures and examples are universal but
inefficient in communicating a small, specific chunk of information. 

Let us now partition the types of tasks in our system among these
various representations.  The frequent, opaque system tasks (like
evaluating interestingness, aesthetic appeal, deciding who should be
in control next) can be programmed as compiled functions (the only
loss -- that of accessibility of the information inside -- is not
relevant since they should be opaque anyway). 

The specific math knowledge has many sophisticated duties, including
recognizing its own relevance, knowing how to apply itself, how to
modify itself, how to relate itself to other chunks of knowledge,
etc. It seems appropriate that BEINGs be used to hold and organize
this information. The main cost, that of slowness, is not critical
here, since each individual chunk is used infrequently, and a wrong
usage is far more serious than a slow usage.  One final factor in
favor of using BEINGs here is that all the knowledge that is
available at the time of creation of a new BEING will find its way to the
right place; any missing knowledge will be conspicuous as a blank or
incomplete BEING part. 

The contents of each part of each BEING are composed of specialized rules,
assertions, simple programs, and pointers to other parts and other BEINGs. The
knowledge may have to be altered from time to time, hence must be
inspectable and interpretable meaningfully and easily, so compiled
code is ruled out. To demand that each part of each BEING be itself a BEING
would trivially cause an infinite regress. Hence the reliance upon
"intermediate" representations. 

Communication between very different entities, for example between
the User and a BEING not designed to talk with him, is best effected via
a picture language and/or an examples language (from which the
receiver must infer the message). Such universal media are proposed
for faltering communications, for holding and relating intuitions of
the essences of the knowledge chunks stored in the BEINGs. 

The representation of intuitive knowledge as pictures and examples is
certainly not original.  Set theory books usually have pictures of
blobs, or dots with a closed curve around them, representing sets.
For our purposes, a set will be represented in many ways.  These
include pointer structures for ⊗6ε⊗*, ⊂, and their inverses; analytic
geometric functions dealing with sets as equations representing
regions in the plane; prototypical examples of sets; a collection of
abstract rules for simulating the construction and manipulation of
sets; and, finally, a set might be intuitively represented as a
square in the cartesian plane.  All these are in addition to the
definite knowledge about sets (definition, axioms and theorems about
sets, specific analogies to other concepts). 
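
A sketch of one set held under several of these representations at once
(the field names, and the set chosen, are assumptions of the sketch):

.B

unit_disk = {
    'examples':  [(0, 0), (1, 0), (0, -1)],          # prototypical members
    'predicate': lambda p: p[0]**2 + p[1]**2 <= 1,   # membership test
    'region':    'x^2 + y^2 <= 1',                   # analytic-geometry form
    'intuition': 'a blob drawn in the cartesian plane',   # opaque stand-in
}

# Different tasks read different facets: a proof wants the predicate,
# a picture for the user wants the region or the blob.
assert all(unit_disk['predicate'](p) for p in unit_disk['examples'])
.E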

The notion of a fuzzy rule will remain fuzzy throughout this
document. The basic idea is that of a production system, with left
sides that can latch onto almost anything, which eventually generate
lots of low-certainty results. These would augment some
BEINGs' intuition parts, and when trying to relate two given BEINGs which
both had such fuzzy abstract rules, one might try to "run" the
combined production system, or merely to "compare" the two systems. 
As with pictures and examples, the benefits of universality and
adaptability outweigh the inefficiencies. 
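
Even so, the skeleton of the idea can be sketched: a production system
whose left sides latch onto almost anything and whose conclusions carry
low certainties. The rule format below is an assumption of the sketch.

.B

def run_fuzzy(rules, facts, rounds=1):
    beliefs = dict(facts)                      # assertion -> certainty
    for _ in range(rounds):
        for matches, conclude, strength in rules:
            for fact, c in list(beliefs.items()):
                if matches(fact):              # left sides latch loosely
                    new = conclude(fact)
                    beliefs[new] = max(beliefs.get(new, 0.0), c * strength)
    return beliefs

# One seesaw-flavored rule: whatever balances, probably still balances
# after the same thing is done to both sides.
rules = [(lambda f: 'balances' in f,
          lambda f: f + ', even after adding one marble to each side',
          0.7)]
print(run_fuzzy(rules, {'pile A balances pile B': 1.0}))
.E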

Opaque simulations of (about a dozen) real-world situations is another important
component of the representation of intuitive knowledge. For example,
there might be a simulated jigsaw puzzle, with models of pieces,
goals, rules, hints, progress, extending, etc.  There will be a
simulated playground, with a seesaw model that will respond with what
happens whenever anything is done to either side of the seesaw. There
will be flamboyant models, like mountain-climbing; mundane ones like
playing with blocks; etc.  The obvious representation for these simulations
is compiled functions, which are automatically opaque. 
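
For instance, the seesaw model might look roughly like the Python sketch
below: a partial scenario goes in, a filled-out scenario comes back, and
the rest of the system never sees the (weight x distance) computation
inside. The field names are assumptions of the sketch.

.B

def SEESAW(scene):
    scene = dict(scene)                   # fill out a copy of the scenario
    left  = sum(p['weight'] * p['distance'] for p in scene.get('left', []))
    right = sum(p['weight'] * p['distance'] for p in scene.get('right', []))
    if 'final_position' not in scene:     # an omitted feature: deduce it
        if left > right:
            scene['final_position'] = 'left side down'
        elif right > left:
            scene['final_position'] = 'right side down'
        else:
            scene['final_position'] = 'balanced'
    return scene

# 60 x 2 on one side, 40 x 3 on the other: the board stays level.
print(SEESAW({'left':  [{'weight': 60, 'distance': 2}],
              'right': [{'weight': 40, 'distance': 3}]}))
.E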

The next items of interest are which parts each BEING must have. 
In PUP6, each BEING had (theoretically) exactly
the same set of parts. In AM, each ⊗4family⊗* will have the same set.
For each possible part, we list below those families having that part:

.BEGIN W(7); TABS 30,40,50,62,74; TURN ON "\"  GROUP
@2   ↓_Part Name_↓\Static\Active\Static Meta\Active Meta\Archetypical
.TABS 34,44,57,69,80
.INDENT 6

⊗6RECOGNITION GROUPING⊗*
 CHANGES\X\X\X\X\X
 FINAL\X\X\X\X\X
 PAST\X\X\X\X\X
 IDEN {not}{quick}\X\X\X\X\X

⊗6ALTER GROUPING⊗*
 GENERALIZATIONS\X\X\X\X
 SPECIALIZATIONS\X\X\X\X
 BOUNDARY\X\X\X\
 DOMAIN/RANGE {not}\\X\\X
 ORDERING(Complete)\\X\X\X
 WORTH\X\X\X\X
 INTEREST\X\X\X\X
 JUSTIFICATION\\\X\
 OPERATIONS\X\X\X\X

⊗6ACT GROUPING⊗*
		CHANGE subgrouping of parts
 BOUNDARY OPERATIONS {not}\X\X\X\
 FILLIN\\\\\X
 STRUCTURE\\\\\X
 CHECK\\\\\X
 ALGORITHMS\\X\\X\
		INTERPRET subgrouping of parts
 REPRESENTATION\X\X\X\X\
 VIEWS\\X\X\\

⊗6INFO GROUPING⊗*
 DEFINITION\X\X\X\X\X
 INTU\X\X\X\X\X
 TIES\X\X\X\X\X
 EXAMPLES {not} {bdy}\X\X\X\X\
 CONTENTS\X\X\X\X\

⊗1**********************************************************************⊗*

.END

.NSECP(Flow of Control in AM)

.SELECT 1

Control in the system: As AM runs, at each moment, each concept
will have only a few of its facet programs filled in; most of the
facets of most of the concepts will be unknown. AM's only control
structure is to repeatedly choose some facet of some concept
(some part of some BEING), and
then use the appropriate strategic knowledge to fill in that missing
program.  The strategic knowledge will typically access and run many
other facet programs from many other concepts.  In the course of
this, new concepts may be constructed and worth giving names to$$
If one part of a BEING becomes large and interesting, it may be split into
several brand new BEINGs. This is how new concepts develop.   For
example, in filling out the Examples part of the BEING 
dealing with Composition, the system will have to think up new compositions
and relations. If one of them turns out to be interesting, it will be
made into a new BEING, and almost all of its 25 parts will be blank. So the
system will then have 25  well-defined questions to try to answer about
a specific, promising new concept.
The Examples part of the Composition BEING would now contain a pointer to
this new BEING.*.
Whenever this happens, the new concept has almost all its facets
blank, which means AM now has about 20 specific tasks to eventually
attend to (if they're deemed interesting enough). So the system never
runs out of things to do; rather, that number keeps growing rapidly. 
It is the judgmental criteria which must limit this otherwise-explosive
growth.
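
A minimal sketch of this single control loop, assuming concepts are records
with named, possibly-blank facets; every function called below is a
hypothetical stand-in for machinery described elsewhere in this document.

.B
;; AM's one top-level loop: pick the best (concept, facet) pair, then
;; run the strategic knowledge which fills that facet in.  Filling-in
;; may construct new, mostly-blank concepts; the interesting ones are
;; kept, and their blank facets become future tasks.
(defun am-top-level (concepts)
  (loop
    (multiple-value-bind (being part) (choose-best-facet concepts)
      (dolist (b (fill-in-facet being part))   ; may create new BEINGs
        (when (interesting-enough-p b)
          (push b concepts))))))               ; the agenda only grows
.E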


AM is interactive: AM informs a human user of the things it finds
which it thinks are interesting.  The user can interrupt AM and
interrogate it, redirect its energies, and so on. Since the user has
several roles$$
Creator (decide what knowledge AM starts with), Guide (by encouragement,
discouragement, suggestion, hint), Absolute Authority (provide needed facts
which AM couldn't derive itself), Co-researcher (if AM is operating in a
domain unknown to the user). *,
AM should have several modes of communicating with him.$$
Traditional math
notation, textbook English, formal (e.g. AUTOMATH or predicate
calculus), pictorial (simulated visual intuitions), examples (I/O pairs, traces).*
Protocols indicate this is feasible; one should
only require a few minutes' instruction before 
comfortably using the system. 

.XGENLINES←XGENLINES-3

.SSEC(Belief and Justification)

The system will justify what it believes with ⊗4intuition⊗* and ⊗4empirical
evidence⊗* as well as with formal reasoning. This justification 
would be recomputed only when needed (e.g., if an apparent
contradiction arose).$$
The aim is
not to build a theorem-prover, yet the leap from "very probable" to 
"certain" is not ignorable.  Many statements are infinitely probable yet
silly. Some sophisticated studies into this problem have
been done [Pietarinen] and may prove usable. 
The mechanism for belief in each fact, its certainty, should be
descriptive (a collection of supporting reasons) with a vector of numerical
probabilities (estimated for each factor) attached. These numbers
would be computed at creation of this
entity, recomputed only as required.   The most fundamental entities may
have ↓_only_↓ numerical weights. 
If the weight of any entity changes, no "chasing
around" need be done. Contradictions are not
catastrophic: they simply indicate that the reasons supporting each of the
conflicting ideas should be reexamined, their intuitive and formal
justifications scrutinized, until the "sum" of the ultimate beliefs in
the contradictory statements falls below unity, and until some intuitive
visualization of the situation is accepted.
If this never happens, then a problem really exists here, and might
have to be assimilated as an exception to some rule, might decrease the
reliability placed on certain facts and methods, might cause the rejection
of some mathematical theory as inconsistent after all, etc.
This belief algorithm, whatever its details, should be embedded implicitly in the
control environment; AM should not have the power to inspect or
modify it. *
Intuition will be simulated by functions which are opaque$$ Their knowledge is
not inspectable by the rest of the system; e.g., compiled code * and fallible.
The footnote
below gives a brief example of how they might work.$$ Consider the intuition about
a seesaw.   This is useful for visualizing anti-symmetry and symmetry,
associativity, and the concept of multiplication (distance x weight).
Our seesaw function will accept some features of a seesaw scenario:
details about each person involved in the scene (exactly where they were,
what their names were, how much they weighed)  and about the initial and final
position of the seesaw board. The intuition function SEESAW 
would accept a ↓_partial_↓ list of such
features, and then return a fully-filled out list. If the final position of
the seesaw 
is one of the omitted parameters,
the function will compute the sums of (weight x distance)
on each side, and decide what happened to the board. If some feature like a person's
name was omitted, it will fill that in by randomly choosing some name. Of course,
the caller of the function doesn't know which features are irrelevant; he doesn't
know which details of the gestalt are computed and which are random. *
Here is the most serious attack on the reliance upon divinely-provided 
intuitive abilities:
we creators might stack the deck; we might contrive just the right intuitions to
drive the worker toward making the "proper" discoveries.  The rebuttal is two-pronged:
first, one must assume the integrity of the creators; they must strive not
to anticipate the precise uses that will be made of the intuition functions. Second,
regardless of how contrived it was, if a small set of intuition models were found
which are sufficient to drive a researcher to discover a significant part of
mathematics, that alone would be an interesting discovery 
(educators would like to ensure
that children understood this core of images, for example).
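
Gathering up the belief mechanism proposed in the footnote above, a minimal
sketch might look as follows. The field names and the combining rule (take
the strongest single supporting weight) are illustrative assumptions, since
the text deliberately leaves the real details open.

.B
;; A belief record: supporting reasons plus a vector of numeric weights,
;; with the overall value computed lazily and recomputed only when some
;; contradiction forces the cache to be cleared.  No "chasing around".
(defstruct belief
  reasons        ; descriptive justifications, intuitive and formal
  weights        ; vector of probabilities, one per reason
  cached-value)  ; overall degree of belief, or NIL if stale

(defun degree-of-belief (b)
  (or (belief-cached-value b)
      (setf (belief-cached-value b)
            ;; illustrative rule: as believable as its best reason
            (reduce #'max (belief-weights b) :initial-value 0.0))))
.E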


.XGENLINES←XGENLINES-3

.SSEC(The Environment)

A rather specialized ⊗4environment⊗* exists to support these BEINGs. Encoded as
efficient opaque functions, the environment must oversee the flow of control
in the system (although the BEINGs themselves make each specific decision as to
who goes next). It must also include evaluations of belief, interest, superficiality,
safety, utility; it must keep brief statistics on when and how each part of each
BEING is accessed; and the environment must maintain a rigidly-formatted
description of the Current Situation (abbreviated CS; this 
structure also includes summaries of recent system history).
When a part is big and heavily accessed, detailed records
must be kept of each usage (how, why, when, final result) of each ⊗4subpart⊗*.
Based on this, the part may be split into a group of new BEINGs, and the value
of the part replaced by a pointer to a list of these new BEINGs. 

The environment would have to accept the messages returned by each attempt to
deal with a certain part of a certain BEING. A success or a failure would mean
backing up to the last decision and re-making it
(usually the top-level "select (P,B) to work on next" decision).
An "interrupt" from a trial
would mean "here is some possibly more interesting info". The environment
must judge whether it is in fact more interesting;
if not, control is returned to the interrupted process.
If so, control automatically switches to that new, interesting part.
Later, there will be no automatic return to the interrupted
process, but whatever sequence of decisions led to its initiation may very
probably lead there again.
Two tricks are planned here. One is a cache: each BEING will let  its
RECOG parts store the last value computed, and let each
such part have a quick predicate which can
tell if any feature of the world has changed which might affect this value.
If not, then no work is done; the old value is simply
returned. If work on x is interrupted, an auxiliary development begun, and
work on x then resumed, most of the decisions leading back to x will probably
not involve any real work, since most of the world hasn't changed. The second
trick is that to evaluate a part, one moves down its code with a cursor,
evaluating as one goes. When interrupted, that cursor is left just at the
point where the work should resume.
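
A minimal sketch of the cache trick in Lisp; the record fields and the shape
of CS are assumptions.

.B
;; Each RECOG part keeps its last value together with a cheap predicate
;; which asks whether anything relevant in the world has changed.
(defstruct recog-part
  compute        ; the expensive function of the current situation CS
  changed-p      ; quick predicate: might CS now affect the old value?
  cached)        ; last value computed, or NIL if none yet

(defun recog-value (part cs)
  (if (and (recog-part-cached part)
           (not (funcall (recog-part-changed-p part) cs)))
      (recog-part-cached part)          ; old value still good: no work
      (setf (recog-part-cached part)
            (funcall (recog-part-compute part) cs))))
.E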

New BEINGs are created automatically if, when a part is evaluated and a new entity
formed, the entity has sufficient interest to make it worth keeping by name.
Also, an existing part P of a BEING B may be replaced by a list of new BEINGs.
The details of when and how to do this restructuring of B.P are stored under the 
Structure part of the archetypical BEING whose name is P. The typical process is as follows:
The environment keeps loose checks on the size and usage of each part; if one
ever grows and eats up much time, it is carefully monitored. Eventually, its
subparts may partition into a set whose usage is nearly disjoint. If this
is seen, then the part is actually split into a new set of BEINGs.
If a new BEING doesn't live up to its expectations, it may be destroyed or enslaved
(overwritten and forgotten; perhaps enough is remembered so that no time is
wasted later on the same concept; perhaps it is demoted into just a subpart of some
part of another BEING).

Here is a more detailed (but still tentative) description of some of these top-level
environment routines:

COMPLETE(P,B) means fill in material in part P of BEING B. 

.QP2←PAGE

@21. Locate P and B.⊗*
If P is unknown but B is known, ask B.ORDERING
and up⊗A*⊗*(B).ORDERING. Also,
there may be special information
stored in some part(s) of B earlier, by other BEINGs, which makes them more or less
promising to work on now$$
up↑n(B).P means the set of BEINGs named P, B.P, 
(B.Ties.Up).P,
((B.Ties.Up).Ties.Up).P,
etc.*.

If B is unknown but P is known, ask P and ask each BEING about interest of filling in P.
Each BEING runs a quick test to see if it is worth doing a detailed examination.
Sometimes the family of B will be known (or at least constrained).

If neither is known, each BEING must see how relevant it is; the winner decides on P.
If there is more than one BEING tied for top recognition, then place the results
in order using the environment function ORD, which examines the Worth components
of each, and by using the value of the most promising part to work on next for each
BEING. The frequent access to the (best part, value) pair for each BEING means that
its calculation should be quick; in general, each BEING will recompute it only when new
info. is added to some part, or at rare intervals otherwise.
After ranking this list, chop it off at the first big break in values, and print it out
to the user to inspect. Pause WAIT seconds, then commence work on the
first in the list. 
WAIT is a parameter set by the user initially$$   0 would mean go on unless user
interrupts you, infinity would mean always wait for user's reply, etc.*.
When you finish, don't throw the list away until after the
next B is chosen, since the list might immediately need to be recomputed! 
If the user
doesn't like the choice you've made, he can interrupt and switch you over.
A similar process occurs if P is unknown (except that the list is never saved).

@22. Collect pointers to helpful information: ⊗*
 Create a (partially ordered) plan for dealing with part P of BEING B (abbreviated B.P).
 This includes the P.FILLIN part, and in fact any existing up⊗A*⊗*(B).P.FILLIN, and
 also some use of the representation, defn, views, dom/range parts of the P BEING.
 Consult ALGORITHMS and FILLIN parts of B and all upward-tied BEING's to B.

@23. Decide what must be done now⊗*; 
 which of the above pieces of information is "best". Tag it as having been tried.
 If it is precisely = one currently active goal, then forget it and go to 3.

@24. Carry out the step.⊗* (Evaluate the interest of any new BEING when it is created)
 Notice that the step might in turn call for accessing and (rarely) filling
 in parts of other BEINGs. This activity will be standard hierarchical calling.
 As parts of other BEINGs are modified, update their (best part, value) estimate.

@25. When done, update.⊗*
 Update statistics in B, P, and current situation. (worth and recog parts)
 If we are through dealing with B.P (because a higher-interest entity exists,
 or because the part is filled in enough for now) goto 1; else goto 3.
 If you stop because of a higher-interest entity, save the plan for B.P inside B.P.
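
A minimal sketch of COMPLETE as just outlined; each subroutine named below is
a hypothetical stand-in for one of the five numbered phases above.

.B
(defun complete (&optional p b)
  (multiple-value-setq (p b) (locate-part-and-being p b))   ; phase 1
  (let ((plan (collect-fillin-plan p b)))                   ; phase 2
    (loop
      (let ((step (best-untried-step plan)))                ; phase 3
        (when (null step) (return))
        (carry-out step p b)                                ; phase 4
        (update-statistics p b)                             ; phase 5
        (when (more-interesting-entity-p)                   ; stop early,
          (save-plan plan b p)                              ; saving the plan
          (return))
        (when (filled-in-enough-p b p) (return))))))
.E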


CURRENT SITUATION is a vector of weights and features of the recent behavior of the system.
.FILL

The Environment also maintains a list of records
and statistics of the recent past activities, in a structure called CS, 
for "Current Situation".
Each Recognition grouping part is prefaced by a vector of numbers which are
dot-multiplied into CS, to produce a rapid rough guess of relevance.
Only the best performers are examined more closely for relevance.
The representation of each CS component is (identification info, motivation,
safety, interest, work done so far on it, final result or outlook). The
actual components might be:
.NOFILL; PREFACE 0

Recent Accesses.   For each, save (B, P, contents of subpart used).
Recent Fillins.    Save (B, P, old contents which were altered).
Current Hierarchical History Stack.  Save  (B, P, why).
Recent Top-level B,P pairs.
A couple of significant recent, but not current, hierarchical (B,P,why) records.
A backward-sorted list of the most interesting but currently-deferred (B,P) fillins.
A few recent or colossal fiascos (B, P, what, why this was a huge waste).


ORD(B,C)  Which of the recognition-tied BEINGs B,C is potentially more worthwhile?

.FILL

This simple ordering function will probably examine the Worth vectors,  perhaps
involving the sum of weighted factors, perhaps even cross-terms such as
(probability of success)*(interest rating).
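
Minimal sketches of the two numeric estimators described above: the dot
product of a Recognition prefix vector into CS for a rapid rough guess of
relevance, and an ORD which scores Worth vectors by a weighted sum plus one
cross-term. The vector components and weights shown are illustrative
assumptions.

.B
(defun dot (u v)
  (reduce #'+ (map 'vector #'* u v)))

(defun rough-relevance (recog-prefix cs)     ; quick, rough, cheap
  (dot recog-prefix cs))

;; Assume, for illustration only, that the first two Worth components
;; are (probability of success) and (interest rating).
(defun ord (worth-b worth-c &aux (weights #(1.0 1.0 1.0)))
  (flet ((score (w) (+ (dot weights w)
                       (* (aref w 0) (aref w 1)))))  ; the cross-term
    (if (>= (score worth-b) (score worth-c)) 'first 'second)))
.E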

.BEGIN SELECT 6; NOFILL; PREFACE 0


PLAUSIBILITY(z)       How believable is z?    INTEREST(z)    How interesting is z?

         each statement has a probability weight attached to it, the degree of belief
         this number is a fn. of a list of justifications
	 Polya's plausibility axioms and rules of inference
         if there are several alternate justifs., it is more plausible
         if some consequences are verified, it is more plaus.
         if an analogous prop. is verified, it is more plaus.
         if the consequences of analogue are verif., it is slightly more plaus.
         the converses of the above also hold
         believe in those things with high enough prob. of belief (rely on them)
         this level should fluctuate just above the point of belief in contradictions
         the higher the prob., the higher the reliability
         the amt. one bets should be prop. to the reliability
         the interest increases as the odds do
         Zadeh: p(∧) is min, p(⊗6∨⊗*) is max, p(¬) is 1-.
         Hintikka's formulae (λ, αα)
         Carnap's formulas (λ)
         p=1 iff both the start and the methods are certain ←← truth
         p=0 iff both start is false and method is false-preserving ←← falsity
	 p is higher as the plausibility is higher, and as the interest is lower
         if ∃ several alternative plaus. justifs., p is higher
         don't update p value unless you have to
         update p values of contradictory props.
         update p values of new props
         maybe update p value if it is a reason for a new prop
      empiricism, experiment, random sampling, statistics
         true ideas will be "verified" in (consistent with) any and all experiments
         false ideas may only have a single exceptional case
	 a single exception makes a universal idea false
         nature is fair, uniform, nice, regular; coincidences have meaning
         more plaus. the more cases verified
         more plaus. the more diff. types of cases verified
         central tendency (mean, mode, median)
         standard deviation, normal distribution
         other distributions (binomial, Poisson, flat, bimodal)
         statistical formulae for significance of hypothesis
      regularity, order, form, arrangement
         economy of description means regularity exists
         aesthetic desc (ana. to known descs. elsewhere)
         each part of desc. is organized regularly
         the parts are related regularly

  Below, αα means ⊗4increases with increasing⊗* (proportionality), and
  αα⊗A-1⊗* means ⊗4decreases with increasing⊗* (inversely proportional).
  Perhaps one should distribute these morsels among the various concerned BEING's:
   Completeness of an analogy  αα  safety of using it for prediction
   Completeness of an analogy  αα⊗A-1⊗* how interesting it is
   How expected a relationship is  αα⊗A-1⊗*  how interesting it is
   How intuitive a conjec/relationship is  αα⊗A-1⊗*  how interesting it is
   How intuitive a conjec/relationship is  αα  how certain/safe it is
   How superficial something is  αα  how intuitive it is
   How superficial something is  αα  how certain it is
   How superficial something is  αα⊗A-1⊗* how interesting it is

  Perhaps included here should be universally applicable algorithms for applying these rules
  to choosing the best strategies, as a function of the situation.

   One crude estimate of interest level is the interest component of the current BEING's
   Modify this estimate in close cases using the above relations
   Generally, choose the most specific strategies possible
   If the estimated value of applying one of these falls too low, try a more general one
   Rework the current BEING slightly, if that enables a much more specific strategy to be used
   Locate specific concepts which partially instantiate general strategies
   The more specific new strategies are associated with the specific info. used
   Once chosen, use the strategies on the most promising specific information
   If a strat. falters: Collect the names of the specific, needed but blank (sub)parts
      Each such absence lowers int. and raises cost, and may cause switch to new strategy
      If too costly, low int, store pointer to partial results in blank parts 
         The partial results maintain set of still-blank needed parts

   Competing goals: On the one hand, desire to maximize certainty,
      safety, complete analogies, advance the level of intuition.
      On the other hand, desire to maximize interestingness, find poss. and poten. interesting ana.
       find unexpected, nonsuperficial, and unintuitive relationships.
   If an entity is used frequently, it should be made efficient.
      Conversely, try to use efficient entities over nearly
      equivalent (w.r.t. given purpose) but inefficient ones.
   If an entity is formally justified but resists intuitive comprehension, its use is
      dangerous but probably very interesting; ibid for intuitive but unprovable.
   Resolve choices in favor of aesthetic superiority

   Maximize net behavior
    Maximize desired effects
      In this case, prefer hi interest over hi safety.
      Generally preferred to the following case.
    Minimize costs, conserve resources
      In this case, prefer safety to interest.
      Locate the most inefficient, highest-usage entity, and improve or replace it
      Use: If time/space become a problem, worry about conservation until this relaxes.
.END
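
A few of the above morsels are definite enough to state in executable form:
Zadeh's connectives, plus one illustrative (not the system's fixed) reading
of "several alternative justifications make a statement more plausible".

.B
(defun p-and (p q) (min p q))       ; Zadeh: p of conjunction is min
(defun p-or  (p q) (max p q))       ; p of disjunction is max
(defun p-not (p)   (- 1 p))         ; p of negation is 1-

;; Each extra justification raises p, but never quite to certainty;
;; this particular combining rule is an assumption.
(defun plausibility (justification-weights)
  (- 1 (reduce #'* (mapcar #'(lambda (p) (- 1 p))
                           justification-weights)
               :initial-value 1)))
.E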

.NSECP(Details: Getting AM going)

This diversion is only for the interested reader. It may be skipped without
loss of continuity. It deals with short-cuts which I can take
to get AM running faster and more easily.

When AM proposes a conjecture, the user may interrupt and say "Assume it is true,
and go on". If he does this consistently, we can debug AM's higher-level
proposing abilities even before its lower-level proving abilities are perfected.

We may be able to beat the "critical mass" constraint slightly.
The vast amount of necessary initial knowledge can be 
generated from a much smaller
core of intuition and definite facts, using the same collection of
strategies and wisdom which also drive the later discovery and the 
development.
Since AM has been given good knowledge-acquisition (strategic) powers,
AM itself should be able to fill in the blank parts of the concepts it starts
with.$$
When you are preparing the knowledge to initially get AM going,
the way to save some effort is by 
omitting whole categories of knowledge which you can expect
the system to discover quickly for itself.
AM has to have good methods for generating examples of whatever
concepts it later comes up with; those same methods can be used to produce
examples of the initially-supplied concepts. This means that the creators can
omit all examples of the concepts originally supplied; the system
will be able to fill in all such gaps.
Notice that AM has to be able to fill in ↓_all_↓ the parts (slots)
of each newly discovered concept, so perhaps only the barest definition or
intuition need be initially taught about each concept.*
The only constraint on 
such a partially-equipped
system is that it first develop a broad base of
intuition, examples, and interrelations, spanning several areas of interest,
before trying to  develop any one particular area in depth.

.COMMENT XGENLINES←XGENLINES-2;

A balanced distribution of intelligence ought to exist: related facets which
are rarely accessed separately should be coalesced, and any single type of facet
used very heavily should be split. Notice that this theme has two quite different
realizations. During run time, this policy would refer to the contents held in
existing BEINGs' parts$$ E.g., the Structure part exists only to indicate what to
do about a BEING's 
part getting too big.*. Before the system is actually created, however,
this policy means altering the proposed anatomy of BEINGs $$ Proclaiming that some
family has to have this extra part present, that these three parts can be replaced
(in all BEINGs of a given family) by the following more general part, etc.*.
The runtime restructurings occur based on knowledge recently created by the
system; the planning-stage restructurings are now being based on results from
hand simulations of the system.  Remember: during runtime, the set of parts that
any specific BEING can ever have is fixed by the family (profession) to which he
belongs.

One difference from PUP6 is that in AM the BEINGs are grouped into families.
Each family has its own set of parts (although there will be many parts present in
many families, e.g. 
Iden). For each family F there will be a fairly general BEING named
F. Under each part of F
is general information which, 
though not applicable to all BEINGs, is applicable to all
BEINGs belonging to family F.  
Similarly, if P is a part name, then the BEING named P contains
information which is useful for dealing with part P of any BEING. There might also exist
an archetypical BEING named F.P, who would have special information for working with
part P of any BEING in family F.  There might even be a BEING called B.P, where B is
some specific BEING, with information that just deals with part P of B and any future
specializations of B. The information stored inside a part of a BEING, for example
the actual contents of B.P, would be 
code capable of computing
B's answer to the question P; the previously
mentioned archetypical BEING named B.P 
would contain strategies for dealing with such
answering code (how to fill it in, how to check it, etc.).
To reiterate:   the contents of a part
are specific knowledge, a little program which can answer a specific query, whereas
the contents of the parts of an archetypical BEING are partially ordered sets
of strategies
for dealing with that part of that type of BEING (how to
extend it, what its structure is, and so on).
Notice we are saying that all the parts with
the same part name, 
of BEINGs in the same family, must all have the same structure$$
This is one additional level of structure from the BEINGs proposed in PUP6.*.

.SELECT 1

When part p of BEING B is filled out, at some point in the sequence S of strategies
listed under the archetypical BEING named B.p or p, some new 
information may be discovered. If S cannot handle this knowledge, then  it will
simply return with the message "I am not through, but here is some fact(s) which
may mean that filling out part p of B is no longer the best activity".
The selected part and BEING may turn out the same, or may change due to the new
info which was just uncovered.
The flavor of the return should thus be one of: Not Done because x is
possibly more interesting; Not Done because x is a prerequisite to doing me;
Done because I succeeded; Done because I failed utterly.

The lower-level BEINGs will provide fast access to well-organized information.
The background environment provides the necessary evaluation services at
high speeds (though the system cannot meaningfully examine, modify, or add to
what environment functions the creators provide).
The BEINGs hold "what to think"; the environment implicitly controls "how to think",
and the archetypical BEINGs explicitly contain hints for "how to think".
The big assumption is that one may think creatively without completely 
knowing how his thought
processes operate; intelligence does not demand absolute introspection.

Each clump is (at least partially) ordered, hence can be executed sequentially.
The result may be to choose a lower-level clump, and/or modify some strategies
at some level (some part of some BEING), and/or create new strategies at some
level (perhaps even to create a new BEING). These latter creations and calls
will be in the form of strong suggestions to the environment.

Common knowledge should in some cases be factored out. Possibilities:
(i) always ask a specific BEING, who sometimes queries a more general
one if some knowledge is missing; (ii) always query the most general
BEING relevant, who then asks some specific ones (This sounds bad);
(iii) ask all the BEINGs pseudo-simultaneously, and examine the
responders (this sounds too costly). The organization of BEINGs into
hierarchical groupings reflects the spirit of (ii). A BEING only
contains additions and exceptions to what its generalization contains,
so (i) is actually the dominant scheme now envisioned.
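
A minimal sketch of scheme (i), assuming each BEING carries an upward tie to
its generalization (the Ties.Up link) and a purely local table of parts; both
accessors are hypothetical names.

.B
;; Ask the specific BEING first; on a miss, ask its generalization,
;; ending at ANY-BEING.  A BEING thus holds only its additions and
;; exceptions, as described above.
(defun being-part (being part-name)
  (cond ((null being) nil)                      ; ran off the top
        ((being-local-part being part-name))    ; found it locally
        (t (being-part (being-up being) part-name))))
.E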

.SELECT 1

There are two kinds of discovery, evolution of abilities, which are specifically
disallowed: (i) the modification of the strategies and
control mechanism themselves$$
Since the precise strategies are so crucial, it might be
advantageous to allow them to evolve. This includes changing the system's
notion of interestingness as its experience grows.
This  was decided against because our thesis is that the same strategies
useful for dealing with premathematical concepts should suffice no matter how
far the system progresses.   One mechanism to effect this kind of development
is the standard sort of feedback paradigm, which in AM's case seems
one full step too ambitious.
Intuitions and strategies may be inspected and altered, just as any other
facets of BEINGs, but
the notions of how to judge interestingness, belief, safety, difficulty, etc. 
(plus all the algorithms for split-second estimating and updating these)
are fixed for
AM by its creator. If they are unsatisfactory, he must retune them.
*, and (ii) the introduction of brand new ⊗4kinds⊗* of
questions, of new parts of a type not foreseen at time of system creation.$$
The postulating of new kinds of parts is a very tricky process, and like the
first exclusion, I feel that the kinds of things one wants to know about
a concept shouldn't vary too much. A mechanism for introducing new kinds of
BEING parts is the following:
whenever a very interesting property P is discovered, with interesting variations,
which applies to many BEINGs, then assert that P is a new
part (slot, question), and that every BEING which has some variation of 
this property should fill
in a slot labelled P, with a description of its variation. 
The common link between all the
BEINGs having  property P should be found, and any new BEING possessing this
characteristic should eventually try to see what variation of P it 
possesses. That is, it should try to fill in a slot labelled P. *

.NSECP(Examples of Individual Modules)

Perhaps it would be beneficial to glance over the file GIVEN at this point.
Each page there contains information relevant to one concept, broken down by facets.
Since that's a couple hundred pages long, we'll just glance at two
typical BEINGs: EXAMPLES and COMPOSITION.
All the 150 BEINGs in GIVEN will eventually be coded by hand and fed to AM.
Since the BEINGs are all considered equal, there is in fact no atypically
interesting one which can be exhibited. So the two below are just mediocre;
they aren't worth poring over.

.SSEC(The "EXAMPLE" module)

<Look at Examples BEING, from the Given file>
This is the BEING named ANY-BEING.EXAMPLES.
Notice the general hints listed under FILLIN, for filling in the examples
of any given concept. β is an abbreviation for BEING. CHECK explains
how to check any unclear example, and REPR is the format to keep
each Examples part in. The Structure part says that we can split off any
interesting example into a new BEING with no bookkeeping required.

.BGIV 



⊗2↓_EXAMPLES_↓⊗* {not} {bdy}	Includes trivial, typical, and advanced cases of each type.

⊗1ACT GROUPING⊗*
 FILLIN    Specialize β (in various ways) until only one entity is specified.
	Any kind: instantiate specializations, and/or defn, and/or intu.
		Inefficient: scan related specific entities until an ex. is found
			also: look for op. F: αα→β; then use f(αα.Examples).
		Consider ana. β's examples, map them back. Examine similar β's exs.
	Special: To get an example that satisfies P, bear P in mind at each spec. choice point.
		Try setting various parameters s.t. they are equal to each other.
	Simplest: plug in simple distinguished concepts into defn/intu. until singleton.
		Consider the trivial case, the empty or minimal extreme, the base step of a recursive defn.
	Big: Plug sophisticated, big, varied  examples of each variable concept into defn/intu.
		If recursive, try to invert it so as to be able to build up harder and harder exs.
			Done: only use it to find a few actual examples.
			Save the inverted procedure for later usage, though.
		Pick some decent example, find a chain of simpler and simpler examples
		leading backward, then turn around and extrapolate off the hard end.
	Prototypical: so representative that they will be useful in the future for
		testing a conjecture for empirical plausibility. 
		Potentially: formally prove that any tie to this ex. is a valid tie.
	Boundary: keep an eye out for two similar examples, one in and the other out.
			(these can come from work on this part and/or from similar B's exs.)
		When such a pair is found, begin transforming them into each other,
		and zero in on the most similar pair of in/out entities. Examine carefully.
	Afterwards: if ¬∃ exs. before, ∃ exs. now, ∃ op O on β with no O.Exs, then
		place into O.Exs. this suggestion:  can plug eles. of β.Exs into O.
 STRUCTURE   Split off any indiv. or class of high-int. examples, keep the rest here.
 CHECK   Each ex. should satisfy the defn. and intu parts.
	If any type of new procedure (e.g., inverted recursive defn) has been created,
	which supposedly generates β's, try to conjecture about the uniqueness of the
	members of the sequence produced, the totality of their covering of all possible β's, etc.
 REPRESENTATION  (name, type, (part present, value, change from β)*, origin, complexity,
	distance from the boundary of β)*
	Data about examples should be stored under part headings (easy to make them BEINGs).
	The types of type are: +, + boundary, -, - boundary.
	The types of origin are: by specializing defn, by intu, to satisfy property.

 
⊗1INFO GROUPING⊗*
 DEFINITION  Specific entities which are (not)(barely) instances of β's.
 INTU  Zero in on specific representatives, individuals.
 EXAMPLES  Access the Examples part of any BEING, and it contains examples of that.
 TIES  Up: Info group.   Side: Specializations.
.END

.GROUP SKIP 2
.SSEC(The "COMPOSE" module)

<Look at Compose BEING, from the Given file>
.BEGIN PREFACE 1 TURN OFF "-"

RECOG:Here are productions which might match against the current situation.

INTEREST: Here are some special features that make a composition interesting.
Since composition is a type of ⊗4operation⊗* which is a type of ⊗4activity⊗*,
which is a type of ⊗4any-Being⊗*,  these hints are
all in addition to (or exceptions to) the hints listed under 
OPERATION.INTEREST and ACTIVITY.INTEREST and ANY-BEING.INTEREST.

VIEWS: Here is how to view this as a set, an operator, a relation, a property.

INTU: This specifies that we may abstractly manipulate expressions about
compositions by thinking in terms of arrows landing and being refired again.
This meshes with the intuition of an OPERATION as a set of arrows being fired
from one set into another.
.END

.BGIV


⊗2↓_COMPOSITION_↓⊗*    Apply an operation to the result of a previous operation.

⊗1RECOGNITION GROUPING⊗*
 CHANGES  {(eles. of dom. of 1st changed to some eles. of range of 2nd., .90, .80, )}
 FINAL {(2nd range ele., .20, .70, intu: final product of a 2-step manufac. process)}
 PAST
 IDEN {not}{quick}    This must be filled in for AM initially, but hasn't been yet.

⊗1ALTER GROUPING⊗*
 GENERALIZATIONS    Sequence of actions, an ordered pair of operations to perform.
	FILLIN: Increase 1st true domain, typically by modifying the definition.
	Increase 2nd range.
 SPECIALIZATIONS  Given an Active f:AxB→C, and an Active g:CxD→E, consider the new Active
	h:(AxB)xD→E, written g*(fxi), defined as h((a,b),c)= g(f((a,b)),c). Alternatively,
	given f:BxD→C and g:AxC→E, construct h:(AxB)xD→E as h((a,b),c)= g(a,f((b,c))).
	An even further specialization: let some of A,B,C,D,E coincide.
	An even further specialization: let all of A,B,C,D,E coincide.
 BOUNDARY  {(second operation doesn't care what the result of the first was, sequencing unnecessary)}
 DOMAIN/RANGE {not}   (AOP⊗A2⊗*, OP, op)
	FILLIN: domain is a pair of operations, range is a new operation whose domain
		is the dom. of the 1st pair element, and whose range is the range of the 2nd.
		int. depends on the 2 ops and on props which this particular op possesses;
		if high enough and no ptr, BEINGize this new op.
 ORDERING(Complete)
 WORTH (.8, .95, .5, .5, .6, .9, .5, (1.0 formal, .8 intu), .8, (1.0 basis))
	FILLIN: To forestall an infinite regress, decrease the activation energy of this investigation.
 INTEREST  Int. property of result which is not true of either argument relation.
	Int. properties of both argument relations are preserved, undesirable ones lost.
	Int. subsets (cases) of domain of 1st map into interesting subsets of range of 2nd.
	Preimages of int subsets (cases) of range(2nd) are themselves interesting subsets of domain(1st).
	The range of the first is equal to, not just a subset of, the domain of the second.
 OPERATIONS

⊗1ACT GROUPING⊗*
 BOUNDARY-OPERATIONS {not}  
 ALGORITHMS    
	FILLIN: Sequence: do 1st op (ALG), then take result and do 2nd op on it (ALG).
 REPRESENTATION  repr. as any operation, or perhaps just as 2nd o 1st.
 VIEWS  to view as a reln: view each arg. op. as a reln, R↓1 ⊂ AxB, R↓2 ⊂ BxC, composition
		is a relation R↓2 o R↓1  ⊂  AxC, where (a,c) is in composition iff
		(a,b) ε R↓1   and   (b,c) ε R↓2.
	to view as op: domain is dom of 1st, intermed. activity, then range is range of 2nd.
	to view as a prop:  prop. master set is (same notation) AxC; R↓2oR↓1(a,c)) ↔
		R↓2(R↓1(a), c).
	to view as a set: set of ordered pairs which is a subset of dom↓1 x range↓2.

⊗1INFO GROUPING⊗*
 DEFINITION   The sequencing of two operations, f and g, where domain of f contains the
	range of g as a subset. E.g., f:A→D, and g:W→B with B⊂A.
	Then the new combined operation is the composition of f and g, written
	f o g, f(g), fg. The computation is straightforward; apply g and then apply
	f to the result. 

	If f:AxB→C,   g:D→A,  and h:E→B, then one can consider the composition f o gxh,
	also written f(g,h), whose value on (x,y) ε DxE is f(g(x),h(y)) ε C.
	This is combinatorially explosive. Suggestions for containment: possibility of
	considering f o gxg;  f o ixj  where at least one of i,j is the identity,
	especially if the other one of i,j is equal to f itself; f o ixj, where one
	of i,j is f, occasionally both are f. In general, f o gxh is about as interesting
	as the ties between the three functions are already (equality is quite high).

	To expand this flexibility, view any Active as an operation for these purposes.
	The composition FoG is more formally defined as the specific set of ordered pairs
	(a,B) where aεdom(G) and B⊂range(F), and B={F(c) | cεG(a)}, together with an
	explicit statement of the domain and range of F and of G.
 INTU   Arrows emanate from one set, go into another set, new arrows go from there to
	a third set. The composition means follow along and transfer to new arrows. 
 TIES   Up: Operation  
 EXAMPLES {not} {bdy}
 CONTENTS
.END
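
The Definition part above is concrete enough to sketch directly in Lisp:
COMPOSE is the plain f o g, and COMPOSE-PAIR is the f o gxh form whose value
on (x,y) is f(g(x),h(y)).

.B
(defun compose (f g)              ; apply g, then f to the result
  #'(lambda (x) (funcall f (funcall g x))))

(defun compose-pair (f g h)       ; value on (x,y) is f(g(x),h(y))
  #'(lambda (x y) (funcall f (funcall g x) (funcall h y))))

;; E.g.  (funcall (compose-pair #'+ #'abs #'abs) -3 4)  =>  7
.E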

.SSEC(Intuition for a Set)

Here are some of the intuitive images which must be simulated by the program
in slot "INTUITIONS" of the BEING named "SET".

 
Our first characterization of a set will be as a
solid rectangle in the Cartesian plane; the opaque
intuition function knows about numerical equality and inequality, hence about
borders of such sets. The notions
of intersection, union, complement, setdifference, disjointness, projection
onto each axis, etc. are also intuitively available.  Notice that the
sophisticated operations required (e.g., projection) will exist as opaque
functions, totally inaccessable to the rest of the system$$ This is worth
rejustifying: it is fair to write a LISP program (which uses the function
TIMES) whose task is to synthesize code for the function TIMES, so long as
the program does not have access to, does not even know about its use of
TIMES. *.

This "square" representation is not well suited to all concepts
involving sets.
For that reason,
the system will simultaneously maintain several of the other forms of
intuitive storage mentioned previously.  Consider, for example, the
possibility of fuzzy rules, which can latch onto almost anything
and produce some type of result (but with low certainty). That is, they
operate at a higher level of abstraction than definite rules, by ignoring
many details. Another possibility is the use of examples. If a small set of
them can be found which is truly representative of a concept, then future
references to that concept can be compared to these examples.  This may
sound very crude, but I believe that people rely heavily (and
successfully!) on it.

Euler, to overcome language problems when lecturing a princess of
Sweden, devised the use of circles to represent sets. Venn and others
have frequently adopted this image. For a machine, it seems more
apropos to use a rectangle, not a circle.  Consider the lattice of
integral points in two dimensions. Now a set is viewed as a rectangle
-- or a combination of a few rectangles -- in this space. This makes it
hard to get any intuition about continuity or boundary or openness, but
works fine for the discrete sets which are dealt with in logic, 
elementary set theory, arithmetic, number theory, and algebra. It is
probable that the system will therefore not be tried in the domains of
real analysis, geometry, topology, etc. with only this primitive notion
of space and confinement.  Specifically, a set in this world is an
ordered pair of pairs of natural numbers. Projection is thus trivial
in LISP (CAR or CADR), as are tests for intersection, subset, etc.
Notice that these require use of numbers, ordering, sets, etc., so the
functions which accomplish them must be opaque.  The interaction
with the rest of the system will be for these pictures to suggest and
reinforce and veto various conjectures.  They serve to generate
empirical evidence for the rest of the system.
To avoid gerrymandering, it might be necessary to view a set as a list
(of arbitrary length) of ordered pairs; an absent pair can be assumed to be
some default pair$$ That is, a set is a simplex in Hilbert space; each set has
infinite dimension, but differs from any other in only finitely many of them.*.
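
A minimal sketch of this rectangle representation in Lisp: a set is a list of
two dotted pairs, the X interval and the Y interval, so that projection
really is just CAR or CADR. The combination-of-rectangles and default-pair
refinements are omitted.

.B
(defun make-rect (x1 x2 y1 y2) (list (cons x1 x2) (cons y1 y2)))
(defun project-x (r) (car r))           ; projection is CAR ...
(defun project-y (r) (cadr r))          ; ... or CADR

(defun interval-inside-p (a b)          ; is interval A inside B?
  (and (>= (car a) (car b)) (<= (cdr a) (cdr b))))

(defun rect-subset-p (r s)
  (and (interval-inside-p (project-x r) (project-x s))
       (interval-inside-p (project-y r) (project-y s))))

(defun rect-intersect-p (r s)
  (flet ((overlap-p (a b)
           (and (<= (car a) (cdr b)) (<= (car b) (cdr a)))))
    (and (overlap-p (project-x r) (project-x s))
         (overlap-p (project-y r) (project-y s)))))
.E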

Still other representations will be maintained: pointer structures (graphs where
each arc represents a relation, e.g. ⊗6ε, ⊂⊗*), activities (the characteristic
function for the set), algebraic equations describing regions of space, etc.

How should the system choose which intuitive representation(s) of a set to use?
Some considerations are: 
	What operations are to be done to this set
(e.g., ⊗6ε⊗*, ⊂, ∩, ∪, ⊗6≡⊗*, =, ',...)? The representations differ in cost of
maintenance and in the ease with which each of these operations can be
carried out. 
	How artificial is the representation for the given set?
Some will be quite natural, e.g., if the set is a nest then use the
pointer structure; if the set is a relation over the small set AxB, then use the
lattice points representation.
	How much is "given away" by the model? This is a
question of fairness, and means that the system-writers must build in
opacity constraints and/or make the intuitive operations faulty.
We shall do both.
	How compatible is each representation with the computer's 
physiology?  Thus it is
almost impossible to represent pictures or blobs directly, but very
suitable to store algebraic equations defining such geometric images.
	Does the representation suggest a set theory with basic elements 
which are non-sets; with an infinite model; with any special desirable or
undesirable qualities? For example, the geometric representation
seems to demand the concept of continuity, which the system probably
won't ever use in any definite way.


There are about 150 BEINGs in the proposed core, 
and each one of them should have an intuition almost
as rich as that for SETS, above. Space precludes delving into each one; some few
lines about each β's intuition is present in the document "⊗4GIVEN KNOWLEDGE⊗*".$$
This  is available as the file named GIVEN, on the directory [AM,AJC], at the
Stanford AI Lab, which is the ARPA-NET site named SAIL.*
.NSECP(Example: The Modules Interacting)

.SSEC(Some Early Numerical Concepts)

In order to be interesting, the example session in the next section
takes place after AM has learned
a few numerical concepts. To make this more plausible, let me briefly show how some
of these might be discovered "naturally" by AM.

One of the things AM knows about relations is when they are interesting.
(The facet named "Interestingness" on the concept named "Relation".)
These
features include: (i) If R is a relation from A to B, then R is interesting if
the image of each element ⊗6xεA⊗* satisfies some B-interesting property; or,
(ii) all (x,y) pairs, for ⊗6x,yεA⊗*, are in some  B-interesting relation.
Suppose AM is considering only relations from sets to sets, so A and B are two sets.
Then
SET.INTEREST is accessed, and he tells the system  that
some interesting set properties and relations are known to be: singleton,
equal, disjoint. These lead, respectively, to the concepts of
Function (all images are singletons), Constant (all images are equal),
1-1 (no two images intersect).
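
A minimal sketch of these three checks in Lisp, assuming a relation is kept
as a list of dotted (x . y) pairs; all the helper names are illustrative.

.B
(defun images-of (x relation)
  (loop for (a . b) in relation when (eql a x) collect b))

(defun domain-of (relation)
  (remove-duplicates (mapcar #'car relation)))

(defun function-p (relation)        ; every image is a singleton
  (every #'(lambda (x) (= 1 (length (images-of x relation))))
         (domain-of relation)))

(defun constant-p (relation)        ; all images are equal (as sets)
  (let ((xs (domain-of relation)))
    (or (null xs)
        (let ((first-image (images-of (first xs) relation)))
          (every #'(lambda (x)
                     (null (set-exclusive-or first-image
                                             (images-of x relation))))
                 xs)))))

(defun one-one-p (relation)         ; no two images intersect
  (loop for (x . rest) on (domain-of relation) always
        (loop for y in rest always
              (null (intersection (images-of x relation)
                                  (images-of y relation))))))
.E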

.ONCE TURN OFF "-"
As another example, a composition is known to be interesting if its component
relations are, and if its domain and range are related by an interesting relation.
This leads to considering MAP-STRUCTURE(REVERSE-ORDERED-PAIR); that is, take
a relation, view it as a set of ordered pairs, then reverse each one, then
view the resultant set as a relation again. 
This is just the concept of Inverse, and is interesting because it takes
relations on CROSS-PRODUCT(A,B) into relations on 
CROSS-PRODUCT(REVERSE-ORDERED-PAIR((A,B))).

For our example, all the following concepts should be developed already:
Count$$ This operation converts any list (of length n)
into canonical form (perhaps a list of n NILs).*,
Inverse, Commutativity, Transitivity, Associativity, Singleton, Function,
Successor, Zero, Plus, Times, One, Two.

.SSEC(A Hypothetical Session)

Assume that AM has a grasp of the early numerical concepts mentioned, as well
as of premathematical ones.
We are now ready to examine the
example of how a  session with AM might appear to the user.
In a few minutes,
we'll go into detail about how AM actually does this.
Suppose that AM has just considered the concept of 
repeated TIMES-ing, that is, exponentiation.


<THE FOLLOWING IS THE COMPLETE SLIDE; AFTERWARDS IS THE SET OF ACCOMPANYING PATTER>

.BEGIN SELECT 6 FAS INDENT 0,3,0

1. SYSTEM: Rate of drop of interest in Repeated TIMES-ing
(compared to SUCCESSOR, PLUS, TIMES) is huge.
Not associative or commutative; no Left identity element.
Dissuades me from pursuing higher order constructed operations.

2. SYSTEM: I am now investigating the inverses of Plus and Times.
In general, 
the inverses
are not functions. 

3. SYSTEM: Looking at more examples of TIMES⊗A-1⊗*. 
TIMES⊗A-1⊗*(n) seems always to include, as one ordered pair,
(n, 1). Same for (1, n).
Conjecture: (∀xεN)  (x,1) ε TIMES⊗A-1⊗*(x), and (1,x) ε TIMES⊗A-1⊗*(x).

4. SYSTEM: Proof is simple: TIMES(x,1)=x=TIMES(1,x).

5. SYSTEM: The two pairs (n,1) and (1,n) 
are distinct iff the number n is not equal to 1.
Noticing the number 1 as an exception, we can state:
Corollary: COUNT*TIMES⊗A-1⊗*(1) = 1;  (∀x>1) COUNT*TIMES⊗A-1⊗*(x) ≥ 2.
COUNT * TIMES⊗A-1⊗* is the number of ways one may express a number as a product
of an ordered pair of numbers.

6. SYSTEM: Interesting to consider the subset P of N, 
defined as {xεN | COUNT*TIMES⊗A-1⊗*(x)=2}.

7. USER: (interrupting) Why is this interesting?

8. SYSTEM: At the moment, we have the known fact f(x)≥y; try to pin down extreme cases,
those x for which f(x) is precisely equal to y. f is COUNT*TIMES⊗A-1⊗*, in the current
case, and y is 2.  Do you have any special names you would like me to refer to
COUNT * TIMES⊗A-1⊗* by? Any special names for the subset P?

9. USER: Call that composition FAC.
Call the set P, of extreme numbers under FAC, PRIMES.

10. SYSTEM: Syntactic transformation: PRIME is the predicate which means "is in PRIMES".

11. SYSTEM: TIMES may also be viewed as operating on a BAG; 
in this sense, TIMES⊗A-1⊗* would 
return all possible bags (of any length) whose product is the given number. 
We do not allow bags to have <2 members, and we do not allow bags longer than 2
to have any 1's in them.
In this case, (5) just says: COUNT(TIMES⊗A-1⊗*(x)) ≥ 1. 
The extremes now are just 1 and PRIMES.   
I want a new name for this new interpretation of TIMES⊗A-1⊗*.

12. USER: Call it FACTORS.

13. SYSTEM: Conjecture: FACTORS(x) always seems to contain a bag containing only
ones and primes, and only one such bag. 

14. SYSTEM: Proof of existence is by contradiction,
using the lemma  ¬PRIME(x) → ∃(u,v)εTIMES⊗A-1⊗*(x) s.t. 1<u<x, 1<v<x.
Proof of Uniqueness is by contradiction, using the lemma 
(∀a,bεN, xεPRIMES). x|TIMES(a,b) iff x|a or x|b.

15. USER: Call this the unique factorization theorem. This is very important.
Consider now the sum of all the divisors of a number.

16. SYSTEM: PLUS * UNION * TIMES⊗A-1⊗* is being investigated. Seems to range from
a low extreme of Successor(the number) up to a high extreme of almost  
Times(the number, itself). The numbers at the low extreme seem to be the primes.
The numbers near the high extreme may not be worth naming. An intermediate point
is twice the number itself. 
That is, those numbers n s.t. Times(2,n)= PLUS*UNION*TIMES⊗A-1⊗*(n).
Another intermediate set of points is PRIMES;
that is, those numbers n s.t. PRIME(PLUS*UNION*TIMES⊗A-1⊗*(n)) holds.
This may lead to the conjecture that P*U*T(N) = PRIMES.

17. USER: Call these =2n numbers "Perfect Numbers." Call the others nothing.
There are not many interesting things to prove involving either idea.

18. SYSTEM: Going on to new topic, then. I shall
consider comparing TIMES⊗A-1⊗* collections from two numbers.

<this leads to GCD and relative primeness>

.END

.SSEC(Comments on the Session)

.BEGIN SELECT 6 FAS INDENT 0,3,0

1. $$
The numbering here corresponds to that in the last section.*
AM is not thrilled with exponentiation, so it doesn't even
consider the hyper-(repeated-)exponentiation operation.

2. SYSTEM: I am now investigating the inverses of Plus and Times.
In general, the inverses are not functions. 

3. AM notices that (x,1) and (1,x) are always allowed factorizations of x.

5. A corollary is that TIMES⊗A-1⊗* has at least two members.

6. AM now decides to consider just those numbers whose 
TIMES⊗A-1⊗* has ⊗4precisely⊗* 2 members.

7. The user asks why this is interesting.

8. SYSTEM: At the moment, we have the known fact f(x)≥y; try to pin down extreme cases,
those x for which f(x) is precisely equal to y. f is COUNT*TIMES⊗A-1⊗*, in the current
case, and y is 2.  Do you have any special names you would like me to refer to
COUNT * TIMES⊗A-1⊗* by? Any special names for the subset P?

9. USER: Call that composition FAC, standing for "pairwise factorings".
Call the set P, of extreme numbers under FAC, PRIMES.

11. AM notices that TIMES may also be viewed as operating on a MULTISET.
In this sense, TIMES⊗A-1⊗* would 
return all possible multisets (of any length) whose product is the given number. 
In this case, (5) just says that TIMES⊗A-1⊗*(x) has at least one member.
The extremes now are just 1 and PRIMES.   
AM requests a new name for this new interpretation of TIMES⊗A-1⊗*.

12. USER: Call it FACTORS.

13. SYSTEM: Conjecture: FACTORS(x) always seems to contain a bag containing only
ones and primes, and only one such bag. 

14. AM proves both existence and uniqueness by contradiction,
using the two lemmas it devises.

15. USER: Call this the unique factorization theorem. This is very important.
Consider now the sum of all the divisors of a number.

This leads AM to devise PERFECT NUMBERS.

.END

.SSEC(The Session Revisited: An In-Depth Example)

.TURN OFF "{}"

.SSSEC(The control structure reviewed)

Before examining how AM "really" could carry on the hypothetical dialogue,
let's review the global control structure of the system.
The basic process is to repeatedly (i) see which part of which BEING
is most relevant (w.r.t. CS, the current situation), 
then (ii) deal with that part of that BEING (either
⊗4execute⊗* it, or try to fill it in more). Often, new entities are
constructed$$ E.g., when filling in the  EXAMPLES slot of some  BEING *.
When this happens, they are evaluated for
interestingness. Any new, promising construct is temporarily 
given full "BEING" status:
we assume that all 25 parts might be worth filling in;
we try to answer all 25 "questions" for this new concept.
After a while, if it
doesn't prove interesting, we "forget"  this BEING.$$ Perhaps we
replace it as one subpart of one slot of one BEING, where it came from. This is
also where it would have remained, if it had never seemed interesting. *

Say we are dealing with part P of BEING B, which we  abbreviate B.P.
In the case of extending (or filling it in, if it is currently empty), AM simply
gathers together algorithms suitable for filling in the slot named ⊗4<P or any
generalization of P>⊗* in the BEING named ⊗4<B or any generalization of B>⊗*.
One extreme
case of the latter generalization is 
"algorithms for filling in slot P in ⊗4any⊗* BEING";
though general, this is probably the most common kind of strategy information.
The reason is that most of the information for how to find the answer to some
given facet of a concept does ⊗4not⊗* depend on the kind of concept so much as
the kind of facet.
One of these general packets of knowledge was the ANY-BEING.EXAMPLES BEING we looked at.

.SSSEC(Looking at the new concept named "FACTORS")


Let us now consider a fairly detailed example. We shall pick up near the
end of our earlier dialogue, and explain how AM studies the new BEING FACTORS,
notices the unique factorization theorem, and tries to prove it.

.BEGIN FAS

After the user gives this Active Operation BEING the new name "FACTORS",
it is so recorded.
The definition part of FACTORS is already filled in, namely 
⊗6"FACTORS(x) ≡ {b | BAG(b) ∧ TIMES(b)=x ∧ 1¬εb}."⊗*
The intuition part already contains several adequate images: partitioning a set
into disjoint subsets of equal size; dissection; looking more closely at an artifact
until it divides up into its constituent parts;
dividing up a pile of marbles into smaller piles which all balance each other.
The control function takes over. Recall it wants to repeatedly choose B and P, then
deal with part P of BEING B, abbreviated B.P.
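
As an aside, the definition just quoted is concrete enough to run. Here is a
brute-force Lisp sketch (for arguments of 1 or more, representing a bag as a
nondecreasing list of numbers greater than 1); it reproduces the examples AM
will derive below.

.B
(defun factors (x &optional (least 2))
  (if (= x 1)
      (list '())                    ; FACTORS(1) is just the empty bag
      (loop for d from least to x
            when (zerop (mod x d))
            append (mapcar #'(lambda (bag) (cons d bag))
                           (factors (/ x d) d)))))

;; (factors 2)  =>  ((2))
;; (factors 6)  =>  ((2 3) (6))
;; (factors 75) =>  ((3 5 5) (3 25) (5 15) (75))
.E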

.INDENT 0,3,0


1. Neither P nor B is known. Ask each BEING how relevant it is to the
current situation, CS.  Since CS contains many references to the recently-named
FACTORS BEING, and since that BEING is still quite incomplete, it is not surprising
that it wins this contest. To decide which part of FACTORS to deal with,
and how, we look
at FACTORS.ORDERING; it doesn't exist yet. We next look at
(generalization FACTORS).Ordering; in this case,
FACTORS is a kind of OPERATION, so we look at
OPERATION.ORDERING. 
This too is blank. Ultimately, we look at ANY-BEING.ORDERING, which has some
quite general hints for which order the parts should be dealt with.  In particular,
"concentrate on filling in the General Information parts before doing anything else."
These in turn are ordered "definition, intuition, examples, ties."
The definition and intuition parts are nicely filled out already.
The Examples part is blank, however, and that is where 
AM chooses to work now.

2. So AM has chosen B=FACTORS, P=EXAMPLES. AM is going to fill in the part
labelled FACTORS.EXAMPLES. The control system now goes about collecting relevant
algorithms (and constraints) from the parts labelled
[(generalization⊗A*⊗* FACTORS).(generalization⊗A*⊗* EXAMPLES)].FILLIN.
These parts are:
.BEGIN NOFILL TURN OFF "[]" TABS 32,62; TURN ON "\" INDENT 2; PREFACE 0
[FACTORS.Examples].Fillin\[FACTORS.Genl-Info].Fillin\[FACTORS.Any-Part].Fillin
[OPERATION.Examples].Fillin\[OPERATION.Genl-Info].Fillin\[OPERATION.Any-Part].Fillin
[ACTIVE.Examples].Fillin\[ACTIVE.Genl-Info].Fillin\[ACTIVE.Any-Part].Fillin
[ANY-BEING.Examples].Fillin\[ANY-BEING.Genl-Info].Fillin\[ANY-BEING.Any-Part].Fillin
.END

.ONCE PREFACE 0 INDENT 3

Of these 12 slots, only two have executable code:
[ANY-BEING.EXAMPLES].FILLIN contains almost a page of general techniques for
filling in examples. [ACTIVE.EXAMPLES].FILLIN has about half as many special-purpose
hints which apply whenever the entity is an Activity, ⊗4after⊗* the more general
techniques have been applied.
The control structure now simply appends these specific algorithms onto the
more general algorithms, and comes up with a little program to fill in 
FACTORS.EXAMPLES. This program is run, with (FACTORS,EXAMPLES) as its arguments,
and then control will revert to the "select best B,P" phase.

.ONCE PREFACE 1 INDENT 0,6,0
<IF TIME IS SHORT:>
Time doesn't permit me to go through the whole assembled program. It calls for
manipulating the definition and intuition parts of FACTORS to obtain examples.
In fact, it manages to compute FACTORS of the numbers 0, 1, 2, 6, and 75.
<POINT TO 12:> Here, after these have been computed, they are entered on
the FACTORS.EXAMPLES part. AM then looks at the OPERATIONS part of FACTORS,
and finds various activities listed. For each one of these, it consults its
EXAMPLES part. If it's incomplete, it leaves a little note there saying
"if you want examples of your domain elements, one sure place to find some is on
FACTORS.EXAMPLES." Here we see an instance of how adding knowledge to AM
makes later activity run faster, not slower.
<SKIP THE REST OF THIS SECTION; GO TO NEXT SECTION>


<IF TIME IS NOT SHORT:>

3. Let us now go through this assembled program, and see how it fills in examples
for the FACTORS concept. The first thing it says to try is
"for each specialization x of FACTORS, take all of x.EXAMPLES if it is filled in ;
if not, try to instantiate the description of x and thereby produce such an example".
But the Specializations part of FACTORS hasn't been filled in at all, so there is
no x to work with. A note is placed in FACTORS.SPECIALIZATIONS, which says:
"when you get filled in, instantiate yourself into an example for FACTORS.EXAMPLES".
This concludes the first of about 30 tactics listed under ANY-BEING.EXAMPLES.FILLIN.

4. The second thing to try is "instantiate the definition of FACTORS".
The definition is:
⊗6"FACTORS(x) ≡ {b | BAG(b) ∧ TIMES(b)=x ∧ 1¬εb}."⊗*
To instantiate it, AM must find a specific number x and a set of bags {b}, 
satisfying
the conditions. Since we know TIMES⊗A-1⊗*(2)={(1,2),(2,1)}, it is clear that
FACTORS(2) = {(BAG 2)}.
This is our first real example.

5. The next hint says "instantiate the intuition".  Suppose we look at the
image "partition a set into disjoint subsets of equal size". 
To instantiate it, take a specific
set, say {a,b,c,d,e,f}. 
By enumeration, the only ways to do this are: one subset with 6 elements,
two subsets each with 3 elements, three subsets each with 2 elements, and
6 subsets each with 1 element. The intuition directs the translation of these
results into the statement that the only bags whose TIMES is 6 are:
(BAG 1 6), (BAG 2 3), (BAG 3 2), (BAG 6 1). Eliminating 1's and multiple
occurrences of the same bag, we can say FACTORS(6) = {(BAG 6),(BAG 2 3)}.
This is our second example.

6. The next thing to try is "find an analogy between FACTORS and some other
BEING, say x, and then try to map over the examples of x into examples of FACTORS."
There are no such analogies known at present, so this fails.

7. The next type of example to search for is coincidental: some of the variables in
the definition might coincide. But the only variables here are x and b, one of
which is a number and the other a bag, so no coincidence is possible.

8. Search for the simplest possible examples. Plug the extreme cases of the domains
into the definition. Thus we ask the BEING named NUMBERS what his extreme examples
are. He replies "0 and 1". AM now tries to compute FACTORS(0). This means finding
a bag satisfying TIMES(b)=0. AM asks the BEING named TIMES if he knows any such
b's. TIMES.EXAMPLES replies Yes; in fact, ⊗4any⊗* bag b is allowed if it contains
a ZERO. Thus FACTORS(0) = ⊗6{b | BAG(b) ∧ 0εb ∧ 1¬εb}⊗*.
Finding FACTORS(1) again involves asking TIMES, who says that the only bags whose
image under TIMES is 1 are those consisting only of ONE's. But there is no bag
of 1's not containing a 1, except for the empty bag. So FACTORS(1)={(BAG)}.

9. Search for a sophisticated, huge example. Plug in a sophisticated case of the
argument of FACTORS. AM asks NUMBER.EXAMPLES for a big example, and gets back
(in canonical form) the number 75. After much trial and error, this is finally
converted into the example: 
FACTORS(75)={(BAG 3 5 5), (BAG 75), (BAG 5 15), (BAG 3 25)}.

10. No prototypical example can be found, because so little is known about FACTORS.

11. No boundary examples can be found, since the concepts involved don't have meaningful
boundary examples.

12. At this point, our assembled program tells us to insert a message into each
slot of the form <member of FACTORS.OPERATIONS>.EXAMPLES, saying "if you want
examples of your domain elements, one sure place to find some is on FACTORS.EXAMPLES".
Here we see an example of how adding knowledge to AM will make later processing
⊗4shorter⊗*, not longer: forestalling a future search by leaving the proper information
in a place where it will be found immediately when, and only when, it is later needed.

13. That was the final hint taken from [ANY-BEING.EXAMPLES].FILLIN; AM now turns 
to those on [ACTIVE.EXAMPLES].FILLIN.  
None of these turns out to be applicable here, so the assembled program is finished,
AM has dealt with FACTORS.EXAMPLES, and control reverts to choosing which B,P to
deal with next.

.SSSEC(Conjecturing the Unique Factorization Theorem)

THE CONJECTURING PROCESS: abbreviated

(See commentary immediately following the abbreviation)

.BEGIN SELECT 6 FILL SINGLE SPACE PREFACE 1 INDENT 0,3,0 TURN OFF "{}-"

1. Choose B=FACTORS, P=TIES.
Gather relevant algorithms from the slots labelled:
[FACTORS.Ties].Fillin, [FACTORS.Genl-Info].Fillin, [FACTORS.Any-Part].Fillin,
[OPERATION.Ties].Fillin, [OPERATION.Genl-Info].Fillin, [OPERATION.Any-Part].Fillin,
[ACTIVE.Ties].Fillin, [ACTIVE.Genl-Info].Fillin, [ACTIVE.Any-Part].Fillin,
[ANY-BEING.Ties].Fillin, [ANY-BEING.Genl-Info].Fillin, [ANY-BEING.Any-Part].Fillin.

2. "Let D be the
known BEING representing 
the kind of entity in the
range of FACTORS. Then ask D.INTEREST  how to look for interesting
properties or regularities.  If sparse, ask (generalization⊗A*⊗* D).INTEREST also.
Apply these methods to the output of a typical example of FACTORS.
Check interesting property found, by seeing if it holds
for the other outputs from FACTORS  and ensuring it isn't simply part of the
definition of FACTORS."

3. Ask SET.INTEREST and STRUCTURE.INTEREST for perceptual guidance about
a particular output,
say the output
{(BAG 3 5 5) (BAG 75) (BAG 5 15) (BAG 3 25)} from the call FACTORS(75).

4. "Three
distinct ways a structure  S can be interesting:
S satisfies a known interesting property of Type(S).
Every xεS satisfies 
some single, interesting property.
⊗4Some⊗*  xεS
satisfies some very interesting property."

5. First and second hints fail.
Now look at each element in turn, that is, each
bag. First  consider (BAG 75). This satisfies the property SINGLETON.
Checks with other examples of FACTORS.
Conjecture:
∀x.(BAG x)εFACTORS(x).

6.
Look at (BAG 3 5 5).  Each element satisfies PRIME.
All other examples of FACTORS check.
Conjecture: FACTORS(x) always contains a bag of primes.

7. 
Look at (BAG 3 5 5) still.
Each element satisfies ODD. Does not check out:
FACTORS(2) = {(BAG 2)}. 

8. Look at (BAG 5 15) and at
(BAG 3 25). Nothing interesting.

9. Jumping ahead:
⊗6PF(x) = FACTORS(x) ∩ MAPSTRUC(PRIME, x) = {b | BAG(b) ∧ TIMES(b)=x ∧ 1¬εb ∧
∀zεb. PRIME(z)}⊗*.
Conjecture: PF is a function.

.END

COMMENTARY on the Conjecturing abbreviation

0. A special heuristic, embedded immutably in the control structure, has
AM make a slight effort to deal with the same B ⊗4or⊗* P as it did last
time, but not the same B.P of course. It is not surprising then that
FACTORS is again selected; this time, the TIES part is chosen to be filled in.
So AM is looking for some conjectures involving the concept FACTORS.

1. Information is gathered as before; this time, the relevant parts are:
.BEGIN NOFILL TURN OFF "[]" TABS 33,63; TURN ON "\" INDENT 3; PREFACE 0
[FACTORS.Ties].Fillin\[FACTORS.Genl-Info].Fillin\[FACTORS.Any-Part].Fillin
[OPERATION.Ties].Fillin\[OPERATION.Genl-Info].Fillin\[OPERATION.Any-Part].Fillin
[ACTIVE.Ties].Fillin\[ACTIVE.Genl-Info].Fillin\[ACTIVE.Any-Part].Fillin
[ANY-BEING.Ties].Fillin\[ANY-BEING.Genl-Info].Fillin\[ANY-BEING.Any-Part].Fillin
.END

2. Just read this step through; the quoted hint is self-explanatory.


3. Because the output of a call on FACTORS is a ⊗4set⊗* of bags,
we are directed to ask SET.INTEREST for aid
in perceiving interesting things about a particular output,
say the output
{(BAG 3 5 5) (BAG 75) (BAG 5 15) (BAG 3 25)} from the call FACTORS(75).
SET.INTEREST is not very big, so we ask STRUCTURE.INTEREST as well.

4. STRUCTURE.INTEREST explains that there are three 
distinct ways a structure can be interesting.
First, check whether the structure satisfies any known interesting property of that
type of structure.
If not, check to see whether every element satisfies 
the same interesting property. If not, check to see if ⊗4some⊗* element of the
structure satisfies some very interesting property.
The criterion of interestingness being applied here is the one specified
by the BEING representing the type of the elements. In our present case,
our set is a set of ⊗4bags,⊗* so that
means consulting all the hints and factors present under BAG.INTEREST. But this is
also very sparse, hence we recursively turn to 
STRUCTURE.INTEREST for evaluation criteria.

5. After a reasonable time, AM cannot find any interesting property satisfied by the
given output set.
 It also fails to find any single interesting property satisfied
by all four bags which form the elements of that output set.

Now AM looks at each element in turn, that is, each
bag. First we consider (BAG 75), say. This satisfies the property SINGLETON.
We check with other examples of FACTORS and, sure enough, each one of them
contains, as an element, a bag having the property SINGLETON. Comparing these
singletons to the inputs to FACTORS, we conjecture that 
(BAG x) will always appear in the output set of FACTORS(x).

6. We go back to looking at the individual bags in FACTORS(75).
This time we
look at (BAG 3 5 5).  Each element ⊗4does⊗* satisfy an interesting property,
namely PRIME. We check against the other examples of FACTORS, and sure enough
each one of them contains an element which is a bag of primes. There doesn't
seem to be any obvious relationship to the input argument, so we merely
conjecture that FACTORS(x) always contains a bag of primes, without saying
which primes or how to compute them. This is one half of the Unique Factorization
Theorem. Notice that this is "rough around the edges", namely for the cases
of factors of zero and one, but these will be caught later by an expert BEING
who specializes in checking conjectures just before we start to prove them.
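
The checking performed in this step and the last is mechanical enough to sketch
(hypothetical Common Lisp again, reusing the factors function from the earlier
sketch; the other names are ours):
.B
;; A crude primality test by trial division.
(defun primep (n)
  (and (> n 1)
       (loop for d from 2 to (isqrt n)
             never (zerop (mod n d)))))
;; The two properties noticed in steps 5 and 6.
(defun singletonp (bag) (and bag (null (rest bag))))
(defun all-primes-p (bag) (and bag (every #'primep bag)))
;; "Check with the other examples": does some element of every known
;; output satisfy the property?
(defun check-across-examples (property examples)
  (every (lambda (output) (some property output)) examples))
;; (check-across-examples #'singletonp   (list (factors 6) (factors 75))) ==> T
;; (check-across-examples #'all-primes-p (list (factors 6) (factors 75))) ==> T
.E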

7. Each element of (BAG 3 5 5) also satisfies the property ODD. But this is quickly
rejected by looking at the example FACTORS(2) = {(BAG 2)}.

8. We now look at the next individual bag in FACTORS(75), namely (BAG 5 15).
Nothing interesting is found here or in the next case, (BAG 3 25).

9. Before going on to prove some of these conjectures, let's see how AM might
notice the uniqueness aspects of them. AM knows that some elements of FACTORS(x)
satisfy MAPSTRUC(PRIME), but some don't. It wants to find out how to characterize
those which do; that is, to separate the bags consisting wholly of primes from those containing a nonprime.
So AM will temporarily create a new BEING, called say PF, defined as
⊗6PF(x) = FACTORS(x) ∩ MAPSTRUC(PRIME, x) = {b | BAG(b) ∧ TIMES(b)=x ∧ 1¬εb ∧
∀zεb. PRIME(z)}⊗*.
This means: all bags of primes whose TIMES is x; in other words, all
factorizations of x into bags containing only primes.

In a manner similar to the above, AM will notice that PF of each number seems to
be a SINGLETON. That is, there is only one bag of primes in the FACTORS(x) collection
for a given x. 
The unique factorization theorem can now be
concisely stated as "PF is a function defined on N".
In such a form, it is not surprising that AM will routinely investigate it.
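
Reusing the hypothetical factors and primep sketches from above, PF and the
singleton observation are immediate:
.B
;; PF(x): the members of FACTORS(x) consisting wholly of primes.
(defun pf (x)
  (remove-if-not (lambda (bag) (every #'primep bag))
                 (factors x)))
;; (pf 75) ==> ((3 5 5))     a singleton, as conjectured
;; (pf 6)  ==> ((2 3))       likewise
;; (pf 1)  ==> (())  and  (pf 0) ==> NIL -- the rough edges at 0 and 1.
.E
"PF is a function defined on N" then amounts to saying that pf always returns a
list of length one.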

.SSSEC(Proving Existence)

THE PROVING PROCESS abbreviated

(See commentary immediately following the abbreviation)

.BEGIN SELECT 6 FILL SINGLE SPACE PREFACE 1 INDENT 0,3,0 TURN OFF "{}-"

1. B = UFT, P = Justification.

2. 
[ANY-BEING.JUSTIFICATION].FILLIN
says to execute
GUESS.ALGORITHMS and PROVE.ALGORITHMS.

3. 
"Supporting Contacts".
Test UFT on extreme cases. 
Means: select a number x, then compute FACTORS(x), then check
that one of the bags so produced consists purely of primes.
NUMBER says
that the extremes are 0, 1, and sometimes 2. Failure of 0,1.
Modify the wording of UFT:
⊗6∀numbers x>1, ∃ bag bεFACTORS(x) s.t. ∀zεb.PRIME(z).⊗*

4. PROVE.ALGORITHMS passes off to
Math-Induction.ALGORITHMS. 
Base case x=2; constructor SUCCESSOR. 
Base case already solved.

5. Assume UFT for all numbers between 2 and x.
Problem: prove
UFT(SUCCESSOR(x)).
Not trivial transformation.
Intuition indicates: proof by cases, where cases are:
SUCCESSOR(x) is/isn't prime.

6. 
Assume x+1 is a prime. 
Lemma: ⊗6∀x>2,(BAG x)εFACTORS(x).⊗*

7. 
Assume x+1 is not prime.
Write  x+1  as TIMES(u,v), 
where both  u and v are between 1 and x+1.
Work backwards; Lemma:
For any number a, bag b of numbers, bag d of numbers, if a=TIMES(b) and aεd,
then TIMES(d) = TIMES( SUBSTITUTE(b for a in d)).

8. Proofs of the two lemmas...

.END

COMMENTARY on the Proving abbreviation

There are many good automated theorem provers around, many of them embodying
some of the heuristics that AM does. So it's probably not worth laboring on
through the proof of the UFT.
The interesting aspect of AM is that it ⊗4proposed⊗* the conjecture.

Let's just go a step into this, to see how AM cleans up the (slightly-incorrect)
statement of UFT which was proposed as a conjecture.
This conjecture is made into a temporary BEING, call it UFT, and eventually AM gets around
to filling in its JUSTIFICATION part.

1. Skipping some of the details, assume that the control structure chooses to
work on filling in the JUSTIFICATION part of that new conjecture.

2. The plan for action is assembled
out of the nonblank slot labelled: [ANY-BEING.JUSTIFICATION].FILLIN, which in turn
directs our energies to following the activities specified by
GUESS.ALGORITHMS and PROVE.ALGORITHMS.

3. The only hint on GUESS.ALGORITHMS which affects UFT is the one called
"Supporting Contacts". It says that we should try to test the conjecture
on extreme cases. By asking UFT, we find that testing it in a specific
case means selecting a number x, then computing FACTORS(x), then checking
that one of the bags so produced consists purely of primes.
To find extreme cases means therefore to find extreme numbers. NUMBER says
that the extremes are 0, 1, and sometimes 2. Sure enough, zero and one fail
the UFT conjecture! The hint on GUESS.ALGORITHMS says that if only extreme
cases fail, to simply modify the statement of the conjecture to allow for these
exceptions. The definition of UFT now becomes:
⊗6∀numbers x>1, ∃ bag bεFACTORS(x) s.t. ∀zεb.PRIME(z).⊗*
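
In terms of the earlier factors and primep sketches, this extreme-case test is one
more hypothetical one-liner:
.B
;; UFT(x): some bag in FACTORS(x) is nonempty and consists wholly of primes.
(defun uft-holds-p (x)
  (some (lambda (bag) (and bag (every #'primep bag)))
        (factors x)))
;; (uft-holds-p 0) ==> NIL      extreme case fails
;; (uft-holds-p 1) ==> NIL      extreme case fails
;; (uft-holds-p 2) ==> T        so the statement is reworded to cover x > 1 only
.E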

4. We now execute the code found under PROVE.ALGORITHMS, with UFT as our argument.

This is where we shall leave our proof example. 

<IF TIME IS SHORT, SKIP TO NEXT SECTION!!>

This sets up some demons to watch for places to declare lemmas, use intuition,
etc. Then each type of proof$$ Universal, Existential, Forward, Backward,
Direct, Indirect, Constructive, Nonconstructive, Mathematical Induction,... *
(which are themselves BEINGs) examine UFT to see how relevant they are.
The winner is, say, Proving-Universal-Statements (abbreviated Univ).
His ALGORITHMS part says to consider Mathematical Induction first.
The Math-Induction BEING's ALGORITHMS part says to clearly decide on the
base case and the constructor function. In the case of UFT, this is simply
the case x=2 and the function SUCCESSOR. A big boost in confidence occurs
when Math-Induction finds the base case has just been solved elsewhere
(UFT was verified for the case x=2 when we looked at the extremes 0,1,2).

5. The assumption now made is that UFT is true for x and all its predecessors
(down to the number 2, though not below). The problem is now to prove that
UFT holds when x is replaced by SUCCESSOR(x), call it y.
After some effort, AM gives up trying to establish this in any trivial way.
The intuition for UFT should provide the hint that UFT would be true for y
either because y is a fundamental building block, or if not then
y must be built out of blocks much smaller than y.
This translates as a proof by cases, and PROOF-BY-CASES tries to run.
The two cases suggested by intuition are: SUCCESSOR(x) is/isn't prime.

6. The first of these subtasks is started by assuming that UFT is true for
2,3,...,x, and that x+1 is a prime. But the only conjecture that filling in
FACTORS.TIES provided, besides UFT, was the one saying that (BAG x) is
an element of FACTORS(x). So in our present case (BAG x+1) would be a singleton
bag, whose only element was a prime, and which was found in FACTORS(x+1).
Thus we set up, as a lemma, the fact that ⊗6∀x>2,(BAG x)εFACTORS(x).⊗*

7. The second subproblem is started by assuming that UFT is true for
2,3,..,x, and that x+1 is not a prime. Intuitively, that means that x+1 can
be split into two very small pieces, each much smaller than x, and hence for
whom UFT holds. One of the known properties of PRIME is that
⊗4not⊗* being prime lets you conclude that the number is the product of two numbers
which are smaller than it, but not as small as 1. 
In the present case, noting that we again need to ensure x>1,
we are thus able to  write  x+1  as TIMES(u,v), 
where both  u and v are between 1 and x+1.
To use this fact, we must work backwards and observe that what we need is
the lemma:
For any number a, bag b of numbers, bag d of numbers, if a=TIMES(b) and aεd,
then TIMES(d) = TIMES( SUBSTITUTE(b for a in d)).
Thus we can express x+1 as TIMES((BAG u v)), but by the inductive hypothesis
u=TIMES(r) and v=TIMES(s), for some two bags of primes r,s. By the lemma,
we can replace u and v by the elements of r and s, get a new large bag of primes,
and TIMES applied to this large bag is still x+1.
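
A numeric spot-check of this lemma, in the same hypothetical Lisp as the earlier
sketches, with bags as lists of numbers:
.B
;; Lemma: if a = TIMES(b) and a is in bag d, then
;;        TIMES(d) = TIMES(SUBSTITUTE(b for a in d)).
(defun times (bag) (reduce #'* bag :initial-value 1))
(defun substitute-bag (b a d)
  ;; Replace one occurrence of a in d by the elements of b.
  (append b (remove a d :count 1)))
;; With a=6, b=(2 3), d=(6 5):
;; (times '(6 5))                            ==> 30
;; (times (substitute-bag '(2 3) 6 '(6 5)))  ==> 30
.E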

8. The two lemmas used in this proof are set up as new BEINGs, and their justifications
will eventually be attended to. The first one is proved so easily that it is not
even mentioned to the user; the second one is interesting in its own right as well
as longer to prove, hence it is mentioned and remembered. The proofs of these
lemmas, and the proof of the uniqueness half of the UFT, will be omitted here.
Notice that in all this processing, just a few lines have been printed out to the
user, and there has been no guidance from him whatsoever.


.SSEC(A Few Other Examples)

Let us skim over a few situations and see 
how AM would handle them, how it would discover
some of the interesting information which we won't supply it with initially.


.SSSEC(Effort of Noticing New Analogies)

AM will have a general strategy, located in the ORDERING part of the
BEING named ANY-BEING, which says that if no effort has been expended whatsoever to try
to find analogues of the current BEING, then there is a high interest in doing this
activity (about the same motivation as finding examples of it, 
if there are none yet).

This will lead to two distinct types of behavior. When the system is first started,
whichever BEING is chosen by COMPLETE to be worked on, the Analogy subpart of its
Ties part will be blank, hence early on it will want to find analogies to itself
(it will trigger Analogize, with itself as the only known argument). The second
type of activity occurs when a new BEING is created by the system. It will usually
be worked on almost immediately, and after the highest-priority parts are filled in
(like intuition, definition, perhaps some examples and family ties) the above-mentioned
strategy will direct attention to finding analogies to it. 

When a BEING wants to find its analogues, Analogize will look within that BEING's family, scanning
for another BEING which has one of: (i) syntactically similar definition,
(ii) intuition view which applies to the same real-world situation, (iii) syntactically
similar examples, (iv) similarities between the two BEINGs' Ties parts.

The initial flurry of analogy quests will number about  18,000
( = 5 families  x  30 BEINGs/family  x 30 BEINGs
to interact with  x 4 part-pairings to examine).
Some of these will be precluded almost instantly, so a reasonable figure is about
three CPU hours of time expended, finding about 1,000 possible analogies in toto,
of which only about 100 will prove interesting upon careful examination and will be
made into new BEINGs. Another speedup will occur because many of the initially supplied
BEINGs will not have anything in their Examples or Ties parts to begin with,
so those matches will fail trivially.

The secondary process of analogizing, when a new Ties, Ex, Defn, or Intu part is
added, is of course only a matter of  30 ( = 1 same family  x  30 BEING's to consider 
x 1 same part as the new one) things to look at.

.SSSEC(Filling in the Examples parts of Objects)

.TURN OFF "{}"

Let us now consider a fairly detailed example. What happens when the
system first starts up?  Each BEING will probably only have a few parts,
hence all will clamor for attention. The ORDERING part of ANY-BEING
indicates that after the definition and intuition, the next most
important part to fill in is Examples.  The reason is that examples provide
both motivation and later empirical evidence. 

.ONCE TURN ON "{}"
The environment function Complete takes over; see page {QP2}. The
numbers below refer to the steps listed there. 
The details of each access are omitted, for brevity.

1. Neither P nor B is known. Ask each BEING how relevant it is to the
current situation, CS, which at the moment is almost totally null.
Since most BEING's require ⊗4some⊗* constrained state of CS, in which
something is true or within certain bounds, there are only about
thirty responders (out of about 125 BEING's).  These include Structures
(who want examples), Actives (who want examples and ties),
Static-Metas (examples), and a couple Active-Metas (Guess,
Analogize)
who also want some examples.
The Time component of Analogize's Worth part is
incredibly low (since no arguments -- suggestions for either candidate
in the analogy -- have been proposed). Any of the Actives, and also
Guess, would first ensure that the Examples parts of the things it
deals with be filled in, hence we may as well assume that we are
filling in the Examples part of a Structure or a Static Meta.  The
latter category is not as easy, and its members are generally produced by an
Active Meta's direct command.  
Decide to work on structures.
Of all the structures, Set and Bag are
tied with the highest ORD value (based on their Worth parts).  The
list (Set Bag) is printed to the user, who then has a few seconds to
respond before the system begins working on one of them.  Say the user
doesn't care, and B is now determined to be Set. 
The next step is to choose P, the part of the Set BEING to work on now.
We collect all the
facts on up⊗A*⊗*(Set).ORDERING, which means Set.Ordering, Structure.Ord,
Object.Ord, AnyBEING.Ord. In this case, only the last of these is
nonempty. Each factor is evaluated, and Examples wins with a factor
of .6 on a 0 - 1 scale.   So P is chosen to be Examples.
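
A hypothetical sketch of this selection step; the BEING names and ORD scores below
are invented for illustration:
.B
;; Each responding BEING offers an ORD score computed from its Worth part;
;; every BEING tied at the top score is printed for the user to choose among.
(defun best-beings (responders)      ; responders: a list of (being . ord)
  (let ((top (reduce #'max responders :key #'cdr)))
    (mapcar #'car
            (remove-if-not (lambda (score) (= score top))
                           responders :key #'cdr))))
;; (best-beings '((set . 0.9) (bag . 0.9) (oset . 0.7) (guess . 0.1)))
;;   ==> (SET BAG)     the list printed to the user
.E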

2. Create a plan for filling in Set.Examples.  Collect any helpful
information from the following sources: Examples.Fillin (which
contains many things to try to get new examples), Set.Examples.Fillin
(nothing there), Structure.Examples.Fillin (which contains some
specialized hints: Convert other structures' Examples; make some of
the interestingness features present in up⊗A*⊗*(Set).Interest not just
desirable but actually required, thereby guaranteeing an
⊗4interesting⊗* example), Object.Examples.Fillin (empty),
AnyBEING.Examples.Fillin (empty).  Finally, some additional parts of the
Examples BEING might be relevant later on.  The plan is simple: try the
Examples.Fillin activities, then the second Structure.Examples.Fillin
activity, then check the results with the Examples.Check part. 
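
A hedged sketch of this plan assembly; the advice table below is abbreviated, and
the archetypal Examples BEING's own Fillin advice (which the plan consults first)
is left out:
.B
;; Walk up the generalization chain of Set, collecting whatever Fillin
;; advice each level stores for the Examples part.
(defparameter *up-chain* '(set structure object any-being))
(defparameter *fillin-advice*
  '(((set examples)       . ())     ; empty, per the text
    ((structure examples) . (convert-other-structures require-interest))
    ((object examples)    . ())     ; empty, per the text
    ((any-being examples) . ())))   ; empty, per the text
(defun assemble-plan (being part)
  (loop for level in (member being *up-chain*)
        append (cdr (assoc (list level part) *fillin-advice*
                           :test #'equal))))
;; (assemble-plan 'set 'examples)
;;   ==> (CONVERT-OTHER-STRUCTURES REQUIRE-INTEREST)
.E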

3/4. Try to instantiate specializations of Set.  There are none.
Fail. 

3/4. Try to instantiate definition(s) of Set.  A simple linear
analysis of the base step of the recursive definition yields the fact
that (CLASS ), called PHI, is an example of a set. Return (CLASS ). 

3/4. Try to instantiate and apply the intuition(s). The set intuition
requires a purpose (and often other sets); the intuition is not
designed to create sets from nothing for no purpose (if it were, this
would be highly suspect!). Thus this fails.

3/4. Try to invert the recursive definition, so it produces more
complicated examples.  In the current case, this is trivial. We
simply apply the algorithm: Take known sets and apply Set-insert to
them. To start out, we plug in the specific base-step set, namely
Phi. The result is (CLASS (CLASS ) ), usually written { {} }.  We
reapply the algorithm with one argument PHI, and get 
.BEGIN NOFILL WIDEN 1,7; PREFACE 0
[CLASS (CLASS (CLASS)) (CLASS )]; with both arguments equal to (CLASS (CLASS)), we obtain
[CLASS (CLASS (CLASS)) (CLASS (CLASS)) ]  =  { {{}}, {{}} }.
.END
There is
no reason just now to go on, since we have the algorithm, so we
return these few examples plus we also return the inverted recursive
definition. $$
Aside: We have no way of knowing, though, whether this process
always gives new sets. That is, Phi might equal Setinsert((Class
Phi),(class Phi))=(Class (Class Phi) Phi). One would have to do
inference, using some foundation axiom, to prove that x can never be
an element of x, just to prove that we have three distinct sets here;
the actual proof that the chain [x↓n↓+↓1 = Set-insert(x↓n, x↓n)] never
repeats is not trivial unless the axiom is phrased in just the right
way. The fact that this arose out of inverting a recursive
definition, however, strongly suggests that this algorithm will in
fact yield an infinite number of distinct sets if there are
infinitely many.
An indirect proof could now be proposed, namely assume that A,B,...,Z are the only
distinct sets which can exist, and then derive a contradiction.*
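
The chain in this footnote is easy to render concretely; in the same hypothetical
Lisp, with NIL playing PHI and ADJOIN supplying the insert-unless-already-present
behavior which the axioms would have to justify:
.B
;; x0 = PHI,   x(k+1) = Set-insert(xk, xk)
(defun set-insert (x s) (adjoin x s :test #'equal))
(defun set-chain (n &optional (s '()))
  (if (zerop n)
      s
      (set-chain (- n 1) (set-insert s s))))
;; (set-chain 1) ==> (NIL)           the set containing the empty set
;; (set-chain 2) ==> ((NIL) NIL)     two visibly distinct elements
.E
The elements are distinct as list structures, of course; the footnote's point is
that the set axioms themselves do not hand us that distinctness for free.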

3/4. Tag the Examples parts of Set.Ops (namely Member, Containment, Get-Some-Member,
Equality, Set-insert, and Set-delete) as follows: Put a little note in
each of these parts, saying that Set.Examples contains some examples
of Domain elements for these operations. This will raise the level of
estimated worth of working on ⊗4their⊗* examples parts. $$ This is
one small example of how tangential knowledge speeds up, rather than
slows down, the acquisition of new information.*

3/4. Now we pass from the general Examples.Fillin strategies to the
Structure.Examples.Fillin strategies.  We must conjoin the Interest
properties as if they were requirements.  Set.Interest asks for the
elements of a set to be related by some other known, interesting
relation (besides being members of the same set).  Things which are
so related are located via their Ties parts, so the task is find some
Tied entities and make them the elements of a set. The first part is
done, say, by noticing the Ties part of Set itself, namely to other
structures, and the second part is done when Make-a-Set recognizes
its own relevance. The result is thus (CLASS Hist List Bag Oset Set
Ordered Pair). 
The ease with which this was done signals that it may be explosive, so we
don't pursue this method of set construction any further right now.

3/4. Structure.Interest says that the structure should be such that
certain interesting operations are doable to it efficiently, without
going into detail about which operations.  The part Set.Operations
contains Member, Get-some-member, Subset, Equal, Setinsert,
Setdelete, and some invariance data.  After studying these
operations, it decides that PHI is the most efficient argument to
each and every one of them. This result, while trivial, is noticed as
a surprising (to the system) discovery, and may be sufficient to
ensure that PHI is made into a BEING itself, and its properties studied. 

5. The percentage of success factor in Set.Worth vector is
incremented (say from .9 to .91), its analogic utility factor
likewise (from .8 to .9).  There is not enough activation energy left
to pursue any more examples of Set just now. A marker is left here
indicating how much effort was spent, how the inverted recursive
definition can be used, and the hint of a conjecture about the
diversity of its results. 

1. Return and decide if Set is still the best BEING, and/or if Examples
is still the best part to concentrate upon. Probably, the Examples
part of Bag will be the highest priority part to fill in at the
moment. In this manner, we may suppose in later examples that the
system has spent some time to collect examples of all the structures.

.SSSEC(Considering New Compositions of Operations)

The activity for finding examples of Actives is similar
to finding examples of Sets, except that
Structure.Examples.Fillin is not relevant to Actives, and its place is taken
by Active.Examples.Fillin, which contains one
specific activity: after a new specialized
operator is found, go to the trouble of applying it to  a few examples
of its domain. This gets a bit intricate if, e.g., its domain is itself
a set of operators. The trickiest case, when the Active is the BEING named Compose, is
presented now to help clarify (??) all this "examples of examples of..."  confusion.


1. Complete wants to determine P and B. Again the same thirty or so BEING's respond,
as in example 3. This time, we assume that the static BEING's have their examples filled
in, likewise most of the Actives. Guess and Analogize still are low in their 
run-time worth components (due to unspecified arguments).  Suppose that no examples
are known for the Active BEING named Compose, and its Worth part lets it be chosen.
Ordering (of Any-BEING, actually) specifies that Examples should be filled in.

2. Must devise a plan for filling in the Examples part of the Compose BEING.
Much information exists under Examples.Fillin, and some also under
Active.Examples.Fillin. The latter indicates "Afterwards", so it is done only
after the Examples.Fillin strategies are exhausted. The final information 
recognized to be relevant
is present
in Compose.Algorithms, and is used when the terms of the definition get replaced by
specific examples of themselves.

3/4.  Specialize the definition of Compose.
Must find an ordered pair of operations (f,g),  with dom(f) ⊃ ran(g).
Access the Find.Algorithms part.  This says to consider K={set of known operators}.
Form the cross-product C=KxK. If there is some reasonable ordering on C, order it.
Pick an e in C, say e=(j,k). Check whether ran(k) ⊂ dom(j).
If so, apply Compose.Algorithm. In either case, you can continue by picking
another element of C, etc.
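
A sketch of this scan over C = KxK, under the assumption that each operator's
domain and range are available as simple type tokens (the three signatures below
are toy stand-ins):
.B
;; Each known operator carries (name domain range); f o g is a candidate
;; composition whenever range(g) matches domain(f).
(defparameter *known-operators*
  '((delete structure structure)
    (insert structure structure)
    (member structure truth-value)))
(defun composablep (f g ops)
  (eq (third (assoc g ops)) (second (assoc f ops))))
(defun candidate-compositions (ops)
  (loop for (f) in ops
        append (loop for (g) in ops
                     when (composablep f g ops)
                       collect (list f 'o g))))
;; (candidate-compositions *known-operators*)
;;   ==> ((DELETE O DELETE) (DELETE O INSERT) (INSERT O DELETE)
;;        (INSERT O INSERT) (MEMBER O DELETE) (MEMBER O INSERT))
.E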

By applying the above algorithm, the system uncovers a wealth of possible compositions.
For example, αα ⊗7o⊗* Delete, where αα can be any one of: Insert, Delete, Convert,
Subst, Unite, Common-parts, Member, Contain. Some second-order of worth compositions
include compose*(delete,delete)  and also equal*(delete,delete). Some third-order ones
are compose*(delete,insert) and equal*(delete,insert).

Each example found (and there will be about 300 of them) is made into a BEING whose
Worth part indicates high interest but a short lifetime if nothing new and interesting
is found out about it (if no interesting Tie is discovered rapidly).  
This is done by
Active.Examples.Fillin, which also specifically calls for the investigation of the
specific Ties of the form: (input,output) satisfy some known interesting
relation; see if
the new operation is related to another by yet a third.
The 300 new Active BEING's are chosen one at a time, 
and their intuition and examples parts are filled out. 

Let us take a few examples (not all 300!!). Consider Insert o Delete. This means
take a structure S, produce all the pairs (e, result of deleting e from S), then
call Insert on each of these pairs, getting a list of new structures
characterized by  {S' | ∃e. S' = result of inserting e into the result of deleting
e from S}.  Another view is to be given a structure S and an element e, then
perform Delete(e,S), obtaining structure R, then perform the (explosive) operation
Insert(R), yielding all the pairs (f, result of inserting f into R).
.NOFILL
Some examples are found, say PHI → {(e,PHI)} → { {e} };
{f} → { (e, {f}), (f, PHI) } → { {e,f}, {f} };
(f) → { (e, (f)), (f, NIL) } → { (e,f), (f) };
(f,g) → { (e, (f,g)), (f, (g)), (g, (f)) }  →  { (e,f,g), (f,g), (g,f) }.
.FILL
Filling in the intuition part of Insert o Delete, we get the tight meshing of:
pull x out of container S and then drop it back in. This should indicate
that Insert*Delete may leave the original container unchanged. That would mean that
Insert o Delete (S) = {S}; a slight weakening would be the statement
"S is a member of Insert o Delete (S)". One potential exception is when you can't
pull the thing x out of S; when S is null or (more intelligently) when the entity
x is not already a member of S.
Either the intuition or the examples should indicate a conjecture of the form
.NOFILL
S non-null  ↔  S ⊗6ε⊗* Insert o Delete(S), or perhaps even the sophisticated one:
x is in set S  ↔  Insert(x,Delete(x,S))=S.
.FILL
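
The sophisticated form can be spot-checked mechanically; a minimal sketch, assuming
sets are duplicate-free lists compared without regard to order:
.B
;; Does  x in S  iff  Insert(x, Delete(x,S)) = S  hold for this x and S?
(defun set-delete (x s) (remove x s :test #'equal))
(defun set-insert (x s) (adjoin x s :test #'equal))
(defun set-equalp (a b)
  (and (subsetp a b :test #'equal)
       (subsetp b a :test #'equal)
       t))
(defun law-holds-p (x s)
  (eq (and (member x s :test #'equal) t)
      (set-equalp (set-insert x (set-delete x s)) s)))
;; (law-holds-p 'a '(a b)) ==> T    a is in S, and S is recovered
;; (law-holds-p 'c '(a b)) ==> T    c is absent, and S is not recovered
.E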

Consider now the composition Member o Delete, which takes a structure S,
deletes some element x, then sees if x is a member of S. Notice that the range
is therefore {T,F}. From examples alone, the system should notice that all sets
and osets map into False, and some-but-not-all lists and bags map the same way.
The intuition should answer the riddle "how/when does a list/bag ⊗4not⊗* map into
False?" The answer is that there was more than one x in the original structure,
so there is still at least one left when you pluck out one x. One way to ensure
that this ⊗4will⊗* occur is to insert x twice into the  list or bag
before you try to apply this composition. The conjecture thus arrived at is:
Member*Delete*Insert*Insert(S) contains T iff S is a bag or a list, and otherwise
(for sets and osets) it is {False}.  The user may name this property "duplicity".
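
The "duplicity" behavior is itself a two-line matter to model, with bags as plain
lists and deletion removing a single occurrence (names ours):
.B
;; Member(x, Delete(x,S)): true exactly when S held x more than once.
(defun delete-one (x s)
  (let ((tail (member x s :test #'equal)))
    (if tail
        (append (ldiff s tail) (rest tail))
        s)))
(defun member-o-delete (x s)
  (and (member x (delete-one x s) :test #'equal) t))
;; (member-o-delete 'a '(a b))   ==> NIL     set-like: a single occurrence
;; (member-o-delete 'a '(a a b)) ==> T       a bag exhibiting "duplicity"
.E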

The third example of a composition we shall deal with explicitly here is the awful
one Assign*Never.  First we take an unquantified proposition P, then turn out
ordered pairs (x, ⊗6∀x.¬⊗*P(x)) for all possible variable names x. For each such
pair (x,Q), we then assign to the variable x the value Q.  This bizarre operation
thus could map P="x is in S" to the situation where y had the value "for all y,
x is never in S", where x had the value "⊗6∀x.¬xεS⊗*", etc. 
After much groping, this 
might lead to distinguishing the positive quantifiers (⊗6∀,∃⊗*) from the
negative quantifiers (never and not always).
The intuitions will positively rebel at this unnatural composition, and the BEING will
be allowed to die soon. All the other compositions of the form "Assign o <quantify>"
will be noticed as
similar to this dismal failure, hence the system will not even waste this much time
on them.
Notice that there is nothing wrong with glancing at this horrible composition.
It is only upon examination that the intuitions are asked to ripple outward, hopefully
towards each other, until they intersect in some image. In this case, they don't
meet in any reasonable time, which is the computational equivalent of saying that
their composition is aesthetically disgusting. The only "stupidity" would be to
notice this mismatch and ignore it;  the common brand of "ignorance" would be never
to uncover this intuitive incompatibility.

.SSSEC(Proving a Conjecture)

Let us discuss the examination of the intuitively and empirically justified
conjecture first mentioned in example 4, on the last page:
For any non-null structure S, S is one  result of Insert*Delete(S).

The Test BEING grabs control, and since the conjecture is believed (intuitively
and empirically) he calls
on Prove. The first action under Prove.Algorithms is to clarify the 
existing informal justification.
The intuition was to break off a piece x from S and then glue it back into the
same place; to pull a thing out of a container and then throw it back in.
The examples  include many of each known type of structure.
Prove then asks which type of proving seems most relevant. Proving-⊗6∀⊗*'s is eager,
and narrowly wins over Cases and Natural Deduction. This is to no avail, for
Proving-⊗6∀⊗*'s immediately asks about Cases of structures anyway.
Structures.Specializations informs him that the types are Hist, List, Oset, Bag, Set.

Before working on separate case proofs, as much general-purpose proof as possible
should be firmed up. The intuition is asked to notice features about the element
which is deleted in the successful cases. It infers that the element was always a
member of the structure; it probably doesn't notice that it was the first member
in the ordered case (lists and osets and hists) and any member in the
unordered case (sets and bags). The second step, 
that of insertion, brings you back to
the original structure if you then glue it back in precisely the place you took it
from (ordered) or anywhere (non-ordered).
The Insertion BEING is asked where it puts the element, and its intuition replies that
it goes at the front of the structure, so that Some-member can grab it next.
The reasoning now is that if x is placed such that Some-member grabs it next,
and it had to be placed where it came from, then it had to come from the place which
Some-member would grab. That is, x had to be Some-member of S. The suggested proof is:
.NOFILL
x is Assigned the value Some-member(S).
S' is Assigned the value Delete(x,S).
S'' is Assigned the value Insert(x,S').
Claim S'' = S.
That is, S = Insert(Some-member(S), Delete(Some-member(S), S)).
.FILL
The general-structure axioms are insufficient to prove this, so we finally do break
the conjecture into cases, hopefully using the suggested proof as our model. For the
cases of Sets and Osets, this proof works fine. For the rest, however, the duplicity
simply confuses the issues and leads, e.g., to infinite chains of inductions which
simply don't get any easier. The core of this dilemma is the need to count the number
of occurrences of each element in a bag or list, which of course the system can't
do now. The two alternatives are to defer this until later, but rely on it unless
proven otherwise, or to add new list/bag axiom(s) from which this conjecture 
could be deduced.  The former
is preferred, since it avoids interacting with the user,
and since Assuming is a way of "copping out".
So we postpone further attempts
at proving this until some new, powerful knowledge is gained relevant to
bags and lists.
A note is added in case any new bag or list axiom is later considered: its value is
boosted if it also helps prove this conjecture (in addition to its reason for
existing).
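
That missing capability is nothing more than an occurrence count; a sketch of what
would have to be added before the deferred proof could go through:
.B
;; Counting occurrences in a bag -- the ability the text says AM lacks here.
(defun occurrences (x bag) (count x bag :test #'equal))
(defun bag-equalp (a b)
  ;; Bags are equal when every element occurs equally often in each.
  (flet ((same-count (x) (= (occurrences x a) (occurrences x b))))
    (and (every #'same-count a)
         (every #'same-count b))))
.E
With bag-equalp in hand, the deferred cases of the conjecture reduce to checking
that deleting and then re-inserting x leaves every occurrence count unchanged.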

The next example goes into detail about a much more trivial conjecture.


.SSSEC(Formally Investigating an intuitively believed conjecture)

Note: It is difficult to find hard proofs at this low level.  
PHI=0={}=(CLASS )=empty set.
Below is what the user might see at near-maximal verbosity level.

.NOFILL

(1) Conjecture: The only relation from 0 to any set X is 0.

Test recognizes: conjecture 
     Intuitive Justification: Cannot seem to find any place for any arrow of the reln. to come
     from (i.e. can't draw arrow because can't choose an ele. from domain because there aren't any)
     Conclusion: since this is believed, we shall try to prove it, not disprove it.
     Access: A relation between A and B is a subset of A X B.
     Access: A X B is the set of all ordered pairs <a,b> such that a ⊗6ε⊗* A and b ⊗6ε⊗* B

Containment.Iden: To prove Any αα is BEING, consider any αα, show it's BEING.
     Consider any relation R: 0 → X.  Show it is 0.  Ask the BEING named PHI how to prove R=PHI.
     Answer: Show all subsets of 0 x X are 0; alternative: assume z is an ele, derive contradiction.
     Intuition: All subsets of a set are empty iff the set is empty. (Becomes a lemma.)
     Must show 0 x X = 0 for all sets X. This is intuitive.  (Becomes a lemma.) Done.

Prove: To prove p iff q, prove p implies q and q implies p.  To prove
     p implies q, assume p and the negation of q, and derive a contradiction.

     Now must prove two lemmas, by contradiction:
     (1) Say X is not empty but all its subsets are.  If X is not empty,
         there is some x ⊗6ε⊗* X.  If x ⊗6ε⊗* X then {x} ⊂ X. But {x} is not empty. Contradiction.
         Say X is empty but it has a non-empty subset Y.  If Y is non-
         empty, there is some y ⊗6ε⊗* Y.  By definition of subset, y ⊗6ε⊗* X.  Contradiction.

     (2) Access the definition of Cross-product to interpret 0xX. Any member z is of the
	 form (a,b), with a in 0 and b in X. But a can never be in 0. Contradiction.

Popping up, we discover that (1) is now proved.

Try to prove the converse of (1). Analogy with last proof (this will actually work) 

.GROUP
.SSSEC(Other potentially productive examples)

The following might be interesting to actually simulate by hand before programming:

Discovering and developing a particular analogy.
Discovering and developing the idea of the transpose of a relation.
Working out a complicated inductive proof.
.APART
.FILL

.NSECP(Parameters: Size / Timetable / Results so far)

.BEGIN NOFILL INDENT 0 PREFACE 0 TABS 36,45 TURN ON "\" SELECT 1

.SSEC(Parameters Characterizing the Magnitude of AM)

     NUMBER\Initially\Ultimately (but recall: there is no set goal)
Number of Families of BEINGs\  5\  5
Number of BEINGs per family\ 30\100
Number of Parts per BEING\ 25\ 25
Size of each part (lines of LISP)\  5\  7
Number of parts filled in\  8\ 20
Size of avg. BEING (LISP lines)\ 40\140
Total number of BEINGs\150\500
Core used by AM\200K\500K
Time per int. result (cpu min)\ 15\  5
CPU time used (hours)\  0\ 50
Human time used (person-hours)\300\1500
Dissertations completed\  0\  1
.E

.SSEC(Timetable for the AM Project)

(i). Codify the necessary core of initial knowledge 
(facts and the wisdom to employ them).
⊗7Reality: See Given Knowledge, as presented in a separate volume. 
Completed in December, 1974.⊗*

.BEGIN  TURN ON "{"

(ii). Formulate a sufficient set of new ideas, 
design decisions, and intuitive assumptions
to make the task meaningful and feasible.
⊗7Reality: firmed up in January, 1975.⊗*

(iii). Use these ideas to represent the core knowledge of mathematics collected 
in (i),
in a concrete, simulated system.
⊗7Reality: the current version of Given Knowledge casts this into the concept-family format.
Hand-simulations done during February, March, and April, 1975, 
with this "paper" system.⊗*

(iv). Implement a realization of AM as a  computer program.
⊗7Reality: Under way May 1, 1975.⊗*

(v). Debug and run the system. Add the natural language abilities gradually, as needed.
⊗7Reality: Scheduled for June to November of 1975. First interesting results
expected in late June.⊗*

(vi). Analyze the results obtained from AM, with an eye toward: overall
feasibility of automating creative mathematical discovery and theory development;
adequacy of the initial
core of knowledge; adequacy of the ideas, design decisions, implementation
details, and theoretical assumptions.  Use the results to improve the system;
when "adequate,"  forge ahead as far as possible into as many domains as possible,
then reanalyze. ⊗7Reality: 
the (v)↔(vi) cycle will terminate in the Winter of 1976.⊗*

(vii). Experiment with AM. ⊗7Reality: scheduled to begin in December, 1975.⊗*

.END

.SSEC(Results as of 5/14/75)

The control structure has been written and debugged; it involves a priority
queue of candidates for AM's attention. The size is about 5 pages of
top-level routines (e.g., Find-new-candidates), 
and 5 pages of utility functions (e.g., Dump-new-Beings-file).
A few BEINGs have been partially coded and introduced (notably AnyBeing.Examples
and also Set-structure). The system decided that it would be interesting to fill
in examples of sets, and came up with the sets:
(CLASS), (CLASS DOUG AVRA CORDELL ED), (CLASS (CLASS)), and
(CLASS R3-5 R3-6 R4-5 R4-6 R5-5 R5-6). These correspond to the empty set,
a set of usernames, the set containing the empty set, and a 2x3 rectangle
of integral lattice points.  This took a couple of cpu seconds. Afterwards,
AM couldn't find anything interesting to do, and collapsed. The current
task is now to encode and feed in the BEINGs involving Relation,
instances of Relations, Composition;  that might be enough to allow AM to
"take off" -- to never run out of things to do, although the interest
level of concepts built just on these few basic ones cannot be very high.

.ASEC(Bibliography)

.ASSEC(Comparison to Other Systems)

One popular way to explicate a system's design ideas is to compare it to other,
similar systems, and/or to others' proposed criteria for such systems. There is
virtually no similar project known to the author, despite an exhaustive search
(see Bibliography). A couple of tangential efforts will be mentioned, followed by
a discussion of how AM will measure up to the understanding standards set
forth by Moore and Newell in their MERLIN paper.
Next comes a listing of the books which were read, and finally a bibliography
of relevant
articles.

Several projects have been undertaken which comprise a small piece of the proposed
system, plus deep concentration on some area ⊗4not⊗* under study here. For example,
Boyer and Moore's theorem-prover embodies some of the spirit of this effort, but its
knowledge base is minimal and its methods purely formal.  Badre's CLET system worked
on learning  the decimal addition algorithm
$$ Given the addition table up to 10 + 10,
plus an English text description of what it
means to carry, how and when to carry, etc.* but the ⊗4mathematics discovery⊗*
aspects of the
system were neither emphasized nor worth emphasizing; it was an interesting natural
language communication study. 
The same comment applies to several related studies 
by IMSSS$$See [Smith], for example.*.
Gelernter has worked on using prototypical examples
as analogic models to guide search in geometry, and Bundy has used "sticks" to help
his program work with natural numbers.  
Kling has studied the single heuristic of analogy, and Brotz has written a
system which uses this to propose useful lemmata; both of these are set up as
theorem provers, again not as discoverers.
One aspect that each of these systems lacked
was size: they all worked in tiny toy domains, with minuscule, carefully prearranged
knowledge bases, with just enough information to do the job well, but not so much that
the system might be swamped. AM is open to all the advantages and all
the dangers of a non-toy system with a massive corpus of data to manage.  The other
systems did not deal with intuition, or indeed any multiple knowledge source (except
examples or syntactic analogy). 
Certainly none has considered the paradigm of ⊗4discovery and evaluation of
the interestingness of structure⊗*; the others have been "here is your task, try and
prove it,"  or, in Badre's case, "here is the answer, try and translate/use it."

There is very little thought about discovery in mathematics from an algorithmic
point of view; even clear thinkers like Polya and Poincare' treat mathematical 
ability as a sacred, almost mystic quality, tied to the unconscious.
The writings of philosophers and psychologists invariably attempt to examine
human performance and belief, which are far  more manageable than creativity
in vitro.  Belief formulae in inductive logic (e.g., Carnap, Pietarinen) 
invariably fall back upon how well they fit human measurements. The abilities of
a computer and a brain are too distinct to justify blindly working for results
(let alone algorithms!) in the one which match those of the other.

In an earlier section we discussed criteria for the system.
Two important criteria are final performance and initial starting point.
That is, what is it given (including the knowledge in the program environment),
and what does AM do with that information?  Moore and Newell have published some
reasonable design issues for any proposed understanding system, and we shall now
see how our system answers their questions$$
Each point of the taxonomy which they
provide before these questions is covered by the proposed system.*.

.BEGIN W(6) 

Representation: Families of BEINGs, simple situation/rules, opaque functions.
	Scope: Each family of BEINGs characterizes one type of knowledge. 
			Each BEING represents one very specialized expert.
			The opaque functions can represent intuition and the real world.
	Grain: Partial knowledge about a topic X is naturally expressed as an incomplete BEING X.
	Multiple representations: Each differently-named part has its own format, so, e.g.,
		examples of an operation can be stored as i/o pairs, the intuition points to an
		opaque function, the recognition section is sit/action productions, the
		algorithms part is a quasi-executable partially-ordered list of things to try.
Action: Most knowledge is stored in BEING-parts in a nearly-executable way; the remainder is
	stored so that the "active" segment can easily use it as it runs.  The place that
	a piece of information is stored is carefully chosen so that it will be evoked
	in almost all the situations in which it is relevant.  The only real action in the
	system is the selective completion of BEINGs parts (occasionally creating a new BEING).
Assimilation: There is no sharp distinction between the internal knowledge and the
	task; the task is really nothing more than to extend the given knowledge while
	maintaining interest and aesthetic worth.  The only external entities are the
	user and the simulated physical world. Contact with the first is through a
	simpleminded translation scheme, with the latter through evaluation of opaque
	functions on observable data and examination of the results.
Accommodation: translation of alien messages; inference from (simulated) real-world example data.
Directionality: The Environment gathers up the relevant knowledge at each step to fill
	in the currently worked-on part of the current BEING, simply by asking that part
	(its archetypical representative), that BEING, and its Tied BEINGs what to do.
	Keep-progressing: at each stage, there will be hundreds or thousands of unfilled-in
		parts, and the system simply chooses the most interesting one to work on.
Efficiency: 
	Interpreter: Will the contents of BEING's parts be compilable, or must they remain
		completely inspectable? One alternative is to provide two versions, one
		fast one for executing and one transparent one for examining. 
		Also provide access to a compiler, to recompile any changed (or new) part.
	Immediacy: There need not be close, rapid-fire communication with a human,
		but whenever communicating with him, time ⊗4will⊗* be important; thus the
		only requirement on speed is placed upon the translation modules, and
		they are fairly simple (due to the clean nature of the mathematical domain).
	Formality: There is a probabilistic belief rating for everything, and a descriptive
		"Justifications" component for all BEINGs for which it is meaningful.
		There are experts who know about Bugs, Debugging, Contradiction, etc.
		Frame problem: when the world changes, make no effort to update everything.
			Whenever a contradiction is encountered, study its origins and
			recompute belief values until it goes away.
Depth of Understanding:  Each BEING is an expert, one of whose duties is to announce his
	own relevance whenever he recognizes it. The specific desire will generally
	indicate which part of the relevant BEING is the one to examine. In case this loses,
	each BEING has a part which (on the basis of how it failed) points to alternatives.
	Access to all implications: The intuitive functions must simulate this ability,
		since they are to be analogic. The BEINGs certainly don't have such access.

.END

.ASSEC(Books and Memos)

All the references below have actually been read as background for AM.
They form what I hope is a comprehensive list of publications dealing
with automated theory formation and with how mathematicians do research.$$
From a single author or project (e.g., DENDRAL), only one or two
recent papers will be listed.*
I strongly recommend those with an "α@" sign; the others proved to be merely
supplementary.

.BEGIN FILL SINGLE SPACE  PREFACE 1 INDENT 0,4,0 TURN OFF "@"; ONCE PREFACE 2

Adams, James L., ⊗4Conceptual Blockbusting⊗*, W.H. Freeman and Co.,
San Francisco, 1974.

Allendoerfer, Carl B., and Oakley, Cletis O., ⊗4Principles of
Mathematics⊗*, Third Edition, McGraw-Hill, New York, 1969.

Alexander, Stephen, ⊗4On the Fundamental Principles of Mathematics⊗*,
B. L. Hamlen, New Haven, 1849.

Aschenbrenner, Karl, ⊗4The Concepts of Value⊗*, D. Reidel Publishing
Company, Dordrecht, Holland, 1971.

Atkin, A. O. L., and Birch, B. J., eds., ⊗4Computers in Number Theory⊗*,
Proceedings of the 1969 SRCA Oxford Symposium, Academic Press, New York, 
1971.

Avey, Albert E., ⊗4The Function and Forms of Thought⊗*, Henry Holt and
Company, New York, 1927.

@Badre, Nagib A., ⊗4Computer Learning From English Text⊗*, Memorandum
No. ERL-M372, Electronics Research Laboratory, UCB, December 20, 1972.
Also summarized in ⊗4CLET -- A Computer Program that Learns Arithmetic
from an Elementary Textbook⊗*, IBM Research Report RC 4235, February
21, 1973.

Bahm, A. J., ⊗4Types of Intuition⊗*, University of New Mexico Press,
Albuquerque, New Mexico, 1960.

Banks, J. Houston, ⊗4Elementary-School Mathematics⊗*, Allyn and Bacon,
Boston, 1966.

Berkeley, Edmund C., ⊗4A Guide to Mathematics for the Intelligent
Nonmathematician⊗*, Simon and Schuster, New York, 1966.

Berkeley, Hastings, ⊗4Mysticism in Modern Mathematics⊗*, Oxford U. Press,
London, 1910.

Beth, Evert W., and Piaget, Jean, ⊗4Mathematical Epistemology and
Psychology⊗*, Gordon and Breach, New York, 1966.

Black, Max, ⊗4Margins of Precision⊗*, Cornell University Press,
Ithaca, New York, 1970.

Blackburn, Simon, ⊗4Reason and Prediction⊗*, Cambridge University Press,
Cambridge, 1973.

Bongard, M., ⊗4Pattern Recognition⊗*, Spartan Books, New York, 1970.

@Brotz, Douglas K., ⊗4Embedding Heuristic Problem Solving Methods in a
Mechanical Theorem Prover⊗*, dissertation published as Stanford Computer
Science Report STAN-CS-74-443, August, 1974.

Bruner, Jerome S., Goodnow, J. J., and Austin, G. A., ⊗4A Study of
Thinking⊗*, Harvard Cognition Project, John Wiley & Sons,
New York, 1956.

Charosh, Mannis, ⊗4Mathematical Challenges⊗*, NCTM, Washington, D.C., 1965.

Cohen, Paul J., ⊗4Set Theory and the Continuum Hypothesis⊗*,  W.A.Benjamin, Inc.,
New York, 1966.

Copeland, Richard W., ⊗4How Children Learn Mathematics⊗*, The MacMillan
Company, London, 1970.

Courant, Richard, and Robbins, Herbert, ⊗4What is Mathematics?⊗*, 
Oxford University Press, New York, 1941.

D'Augustine, Charles, ⊗4Multiple Methods of Teaching Mathematics in the
Elementary School⊗*, Harper & Row, New York, 1968.

Dornbusch, Sanford, and Scott, ⊗4Evaluation and the Exercise of Authority⊗*,
Jossey-Bass, San Francisco, 1975.

Douglas, Mary (ed.), ⊗4Rules and Meanings⊗*, Penguin Education,
Baltimore, Md., 1973.

Dowdy, S. M., ⊗4Mathematics: Art and Science⊗*, John Wiley & Sons, NY, 1971.

Dubin, Robert, ⊗4Theory Building⊗*, The Free Press, New York,  1969.

Dubs, Homer H., ⊗4Rational Induction⊗*, U. of Chicago Press, Chicago, 1930.

Dudley, Underwood, ⊗4Elementary Number Theory⊗*, W. H. Freeman and
Company, San Francisco, 1969.

Eynden, Charles Vanden, ⊗4Number Theory: An Introduction to Proof⊗*, 
International Textbook Company, Scranton, Pennsylvania, 1970.

Fuller, R. Buckminster, ⊗4Intuition⊗*, Doubleday, Garden City, New York,
1972.

Fuller, R. Buckminster, ⊗4Synergetics⊗*, ...

GCMP, ⊗4Key Topics in Mathematics⊗*, Science Research Associates,
Palo Alto, 1965.

George, F. H., ⊗4Models of Thinking⊗*, Schenkman Publishing Co., Inc.,
Cambridge, Mass., 1972.

Goldstein, Ira, ⊗4Elementary Geometry Theorem Proving⊗*, MIT AI Memo 280,
April, 1973.

Goodstein, R. L., ⊗4Fundamental Concepts of Mathematics⊗*, Pergamon Press, 
New York, 1962.

Goodstein, R. L., ⊗4Recursive Number Theory⊗*, North-Holland Publishing Co.,
Amsterdam, 1964.

@Green, Waldinger, Barstow, Elschlager, Lenat, McCune, Shaw, and Steinberg,
⊗4Progress Report on Program-Understanding Systems⊗*, Memo AIM-240,
CS Report STAN-CS-74-444,Artificial Intelligence Laboratory,
Stanford University, August, 1974.

@Hadamard, Jacques, ⊗4The Psychology of Invention in the Mathematical
Field⊗*, Dover Publications, New York, 1945.

Halmos, Paul R., ⊗4Naive Set Theory⊗*, D. Van Nostrand Co., 
Princeton, 1960.

Hanson, Norwood R., ⊗4Perception and Discovery⊗*, Freeman, Cooper & Co.,
San Francisco, 1969.

Hartman, Robert S., ⊗4The Structure of Value: Foundations of Scientific
Axiology⊗*, Southern Illinois University Press, Carbondale, Ill., 1967.

Hempel, Carl G., ⊗4Fundamentals of Concept Formation in Empirical
Science⊗*, University of Chicago Press, Chicago, 1952.

Hibben, John Grier, ⊗4Inductive Logic⊗*, Charles Scribner's Sons,
New York, 1896.

Hilpinen, Risto, ⊗4Rules of Acceptance and Inductive Logic⊗*, Acta
Philosophica Fennica, Fasc. 22, North-Holland Publishing Company,
Amsterdam, 1968.

Hintikka, Jaakko, ⊗4Knowledge and Belief⊗*, Cornell U. Press, Ithaca, NY, 1962.

Hintikka, Jaakko, and Suppes, Patrick (eds.), ⊗4Aspects of Inductive
Logic⊗*, North-Holland Publishing Company, Amsterdam, 1966.

Jouvenal, Bertrand de, ⊗4The Art of Conjecture⊗*, Basic Books, Inc.,
New York, 1967.

@Kershner, R.B., and L.R.Wilcox, ⊗4The Anatomy of Mathematics⊗*, The Ronald
Press Company, New York, 1950.

Klauder, Francis J., ⊗4The Wonder of Intelligence⊗*, Christopher
Publishing House, North Quincy, Mass., 1973.

Klerner, M., and J. Reinfeld, eds., ⊗4Interactive Systems for Applied Mathematics⊗*,
ACM Symposium, held in Washington, D.C., August, 1967. Academic Press, NY, 1968.

Kline, M. (ed), ⊗4Mathematics in the Modern World: Readings from Scientific
American⊗*, W.H.Freeman and Co., San Francisco, 1968.

@Kling, Robert Elliot, ⊗4Reasoning by Analogy with Applications to Heuristic
Problem Solving: A Case Study⊗*, Stanford Artificial Intelligence Project
Memo AIM-147, CS Department report CS-216, August, 1971.

Koestler, Arthur, ⊗4The Act of Creation⊗*,  New York, Dell Pub., 1967.

Korner, Stephan, ⊗4Conceptual Thinking⊗*, Dover Publications, New York,
1959.

Krivine, Jean-Louis, ⊗4Introduction to Axiomatic Set Theory⊗*, Humanities Press,
New York, 1971.

Kubinski, Tadeusz, ⊗4On Structurality of Rules of Inference⊗*, Prace
Wroclawskiego Towarzystwa Naukowego, Seria A, Nr. 107, Wroclaw, 
Poland, 1965.

Lakatos, Imre (ed.), ⊗4The Problem of Inductive Logic⊗*, North-Holland 
Publishing Co., Amsterdam, 1968.

Lamon, William E., ⊗4Learning and the Nature of Mathematics⊗*, Science
Research Associates, Palo Alto, 1972.

Lang, Serge, ⊗4Algebra⊗*, Addison-Wesley, Menlo Park, 1971.

Lefrancois, Guy R., ⊗4Psychological Theories and Human Learning⊗*, 1972.

Le Lionnais, F., ⊗4Great Currents of Mathematical Thought⊗*, Dover
Publications, New York, 1971.

Margenau, Henry, ⊗4Integrative Principles of Modern Thought⊗*, Gordon
and Breach, New York, 1972.

Martin, James, ⊗4Design of Man-Computer Dialogues⊗*, Prentice-Hall, Inc.,
Englewood Cliffs, N. J., 1973.

Martin, R. M., ⊗4Toward a Systematic Pragmatics⊗*, North Holland Publishing
Company, Amsterdam, 1959.

Mendelson, Elliott, ⊗4Introduction to Mathematical Logic⊗*, Van Nostrand Reinhold
Company, New York, 1964.

Meyer, Jerome S., ⊗4Fun With Mathematics⊗*, Fawcett Publications,
Greenwich, Connecticut, 1952.

Mirsky, L., ⊗4Studies in Pure Mathematics⊗*, Academic Press, New
York, 1971.

Moore, Robert C., ⊗4D-SCRIPT: A Computational Theory of Descriptions⊗*,
MIT AI Memo 278, February, 1973.

Nagel, Ernst, ⊗4The Structure of Science⊗*, Harcourt, Brace, & World, Inc.,
N. Y., 1961.

National Council of Teachers of Mathematics, ⊗4The Growth of Mathematical
Ideas⊗*, 24th yearbook, NCTM, Washington, D.C., 1959.

Newell, Allen, and Simon, Herbert, ⊗4Human Problem Solving⊗*, Prentice-Hall, Englewood Cliffs, N. J., 1972.

Nevins, Arthur J., ⊗4A Human Oriented Logic for Automatic Theorem
Proving⊗*, MIT AI Memo 268, October, 1972.

Niven, Ivan, and Zuckerman, Herbert, ⊗4An Introduction to the Theory
of Numbers⊗*, John Wiley & Sons, Inc., New York, 1960.

Olson, Robert G., ⊗4Meaning and Argument⊗*, Harcourt, Brace & World,
New York, 1969.

Ore, Oystein, ⊗4Number Theory and its History⊗*, McGraw-Hill, 
New York, 1948.

Pietarinen, Juhani, ⊗4Lawlikeness, Analogy, and Inductive Logic⊗*,
North-Holland, Amsterdam, published as v. 26 of the series
Acta Philosophica Fennica (J. Hintikka, ed.), 1972.

@Poincare', Henri, ⊗4The Foundations of Science: Science and Hypothesis,
The Value of Science, Science and Method⊗*, The Science Press, New York,
1929. 
.COMMENT main library, 501  P751F, copy 4;

@Polya, George, ⊗4Mathematics and Plausible Reasoning⊗*, Princeton
University Press, Princeton, Vol. 1, 1954;  Vol. 2, 1954.

@Polya, George, ⊗4How To Solve It⊗*, Second Edition, Doubleday Anchor Books, 
Garden City, New York, 1957.

@Polya, George, ⊗4Mathematical Discovery⊗*, John Wiley & Sons,
New York, Vol. 1, 1962; Vol. 2, 1965.

Richardson, Robert P., and Edward H. Landis, ⊗4Fundamental Conceptions of
Modern Mathematics⊗*, The Open Court Publishing Company, Chicago, 1916.

Rosskopf, Steffe, Taback  (eds.), ⊗4Piagetian Cognitive-
Development Research and Mathematical Education⊗*,
National Council of Teachers of Mathematics, New York, 1971.

Rulifson, Jeff, and... ⊗4QA4, A Procedural Frob...⊗*,
Technical Note..., Artificial Intelligence Center, SRI, Menlo
Park, California, ..., 1973.

Saaty, Thomas L., and Weyl, F. Joachim (eds.), ⊗4The Spirit and the Uses
of the Mathematical Sciences⊗*, McGraw-Hill Book Company, New York, 1969.

Schminke, C. W., and Arnold, William R., eds., ⊗4Mathematics is a Verb⊗*,
The Dryden Press, Hinsdale, Illinois, 1971.

Singh, Jagjit, ⊗4Great Ideas of Modern Mathematics⊗*, Dover Publications,
New York, 1959.

@Skemp, Richard R., ⊗4The Psychology of Learning Mathematics⊗*, 
Penguin Books, Ltd., Middlesex, England, 1971.

Slocum, Jonathan, ⊗4The Graph-Processing Language GROPE⊗*, U. Texas at Austin,
Technical Report NL-22, August, 1974.

Smith, Nancy Woodland, ⊗4A Question-Answering System for Elementary Mathematics⊗*,
Stanford Institute for Mathematical Studies in the Social Sciences, Technical
Report 227, April 19, 1974.

Smith, R.L., Nancy Smith, and F.L. Rawson, ⊗4CONSTRUCT: In Search of a Theory of
Meaning⊗*, Stanford IMSSS Technical Report 238, October 25, 1974.

Stein, Sherman K., ⊗4Mathematics: The Man-Made Universe: An Introduction
to the Spirit of Mathematics⊗*, Second Edition, W. H. Freeman and 
Company, San Francisco,  1969.

Stewart, B. M., ⊗4Theory of Numbers⊗*, The MacMillan Co., New York, 1952.

Stokes, C. Newton, ⊗4Teaching the Meanings of Arithmetic⊗*, 
Appleton-Century-Crofts, New York, 1951.

Suppes, Patrick, ⊗4A Probabilistic Theory 
of Causality⊗*, Acta Philosophica Fennica,
Fasc. 24, North-Holland Publishing Company, Amsterdam, 1970.

Teitelman, Warren, ⊗4INTERLISP Reference
Manual⊗*, XEROX PARC, 1974.

Tullock, Gordon,  ⊗4The Organization of Inquiry⊗*, Duke U. Press, Durham, N. C.,
1966.

Venn, John, ⊗4The Principles of Empirical or Inductive Logic⊗*,
MacMillan and Co., London, 1889.

Waismann, Friedrich, ⊗4Introduction to Mathematical Thinking⊗*, 
Frederick Ungar Publishing Co., New York, 1951.

Wickelgren, Wayne A., ⊗4How to Solve Problems: Elements of a Theory of Problems
and Problem Solving⊗*, W. H. Freeman and Co., San Francisco, 1974.

Wilder, Raymond L., ⊗4Evolution of Mathematical Concepts⊗*, John Wiley & Sons,
Inc., NY, 1968.

Winston, P. (ed.), ⊗4New Progress in Artificial Intelligence⊗*,
MIT AI Lab Memo AI-TR-310, June, 1974.
Good summaries of work on Frames,
Demons, Hacker, Heterarchy, Dialogue, and Belief.

Wittner, George E., ⊗4The Structure of Mathematics⊗*, Xerox College Publishing,
Lexington, Mass, 1972.

Wright, Georg H. von, ⊗4A Treatise on Induction and Probability⊗*,
Routledge and Kegan Paul, London, 1951.

.END
.GROUP SKIP 2
.ASSEC(Articles)

.BEGIN FILL SINGLE SPACE  PREFACE 1 INDENT 0,4,0 TURN OFF "@"

Amarel, Saul, ⊗4On Representations of Problems of Reasoning about
Actions⊗*, Machine Intelligence 3, 1968, pp. 131-171.

Bledsoe, W. W., ⊗4Splitting and Reduction Heuristics in Automatic
Theorem Proving⊗*, Artificial Intelligence 2, 1971, pp. 55-77.

Bledsoe, W. W., and Bruell, Peter, ⊗4A Man-Machine Theorem-Proving System⊗*,
Artificial Intelligence 5, 1974, pp. 51-72.

Bourbaki, Nicolas, ⊗4The Architecture of Mathematics⊗*, American Mathematical
Monthly, v. 57, pp. 221-232, published by the MAA, Albany, NY, 1950.

@Boyer, Robert S., and J. S. Moore, ⊗4Proving Theorems about LISP Functions⊗*,
JACM, V. 22, No. 1, January, 1975, pp. 129-144.

@Bruijn, N. G. de, ⊗4AUTOMATH, a language for mathematics⊗*, Notes taken by
Barry Fawcett of lectures given at the Seminaire de mathematiques superieures,
Universite de Montreal, June, 1971. Stanford University Computer Science
Library report number is 005913.

@Buchanan, Feigenbaum, and Sridharan, ⊗4Heuristic Theory Formation⊗*,
Machine Intelligence 7, 1972, pp. 267-...

@Bundy, Alan, ⊗4Doing Arithmetic with Diagrams⊗*, 3rd IJCAI, 
1973, pp. 130-138.

Daalen, D. T. van, ⊗4A Description of AUTOMATH and some aspects of its language
theory⊗*, in the Proceedings of the Symposium on APL, Paris, December, 1973,
P. Braffort (ed). This volume also contains other, more detailed articles on this
project, by  Bert Jutting and Ids Zanlevan.

Engelman, C., ⊗4MATHLAB: A Program for On-Line Assistance in Symbolic Computation⊗*,
in Proceedings of the FJCC, Volume 2, Spartan Books, 1965.

Engelman, C., ⊗4MATHLAB '68⊗*, in IFIP, Edinburgh, 1968.

Gardner, Martin, ⊗4Mathematical Games⊗*, Scientific American, numerous columns,
including especially:  February, 1975.

@Gelernter, H., ⊗4Realization of a Geometry-Theorem Proving Machine⊗*,
in (Feigenbaum and Feldman, eds.) ⊗4Computers and Thought⊗*, Part 1, Section 3,
pages 134-152, McGraw-Hill Book Co., New York, 1963.

Goldstine, Herman H., and J. von Neumann, ⊗4On the Principles of Large Scale
Computing Machines,⊗* pages 1-33 of Volume 5 of A. H. Taub (ed), ⊗4The
Collected Works of John von Neumann⊗*, Pergamon Press, NY, 1963.

Guard, J. R., et al., ⊗4Semi-Automated Mathematics⊗*, JACM 16,
January, 1969, pp. 49-62.

Halmos, Paul R., ⊗4Innovation in Mathematics⊗*, in
Kline, M. (ed), ⊗4Mathematics in the Modern World: Readings from Scientific
American⊗*, W.H.Freeman and Co., San Francisco, 1968, pp. 6-13. Originally in
Scientific American, September, 1958.

Hasse, H., ⊗4Mathematik als Wissenschaft, Kunst und Macht⊗*,
(Mathematics as Science, Art, and Power), Baden-Baden, 1952.

@Hewitt, Carl, ⊗4A Universal Modular ACTOR Formalism for
Artificial Intelligence⊗*, Third International Joint Conference on
Artificial Intelligence,
1973, pp. 235-245.

Menges, Gunter, ⊗4Inference and Decision⊗*, 
A Volume in ⊗4Selecta Statistica Canadiana⊗*,
John Wiley & Sons, New York,  1973, pp. 1-16.

Kling, Robert E., ⊗4A Paradigm for Reasoning by Analogy⊗*,
Artificial Intelligence 2, 1971, pp. 147-178.

Knuth, Donald E., ⊗4Ancient Babylonian Algorithms⊗*,
CACM 15, July, 1972, pp. 671-677.

Lee, Richard C. T., ⊗4Fuzzy Logic and the Resolution Principle⊗*,
JACM 19, January, 1972, pp. 109-119.

@Lenat, D., ⊗4BEINGs: Knowledge as Interacting Experts⊗*, 4th IJCAI, 1975.

McCarthy, John, and Hayes, Patrick, ⊗4Some Philosophical Problems
from the Standpoint of Artificial Intelligence⊗*, Machine Intelligence
4, 1969, pp. 463-502.

Martin, W., and Fateman, R., ⊗4The MACSYMA System⊗*, Second
Symposium on Symbolic and Algebraic Manipulation, 1971, pp. 59-75.

@Minsky, Marvin, ⊗4Frames⊗*, in (Winston) ⊗4Psychology of Computer
Vision⊗*, 1974.

Moore, J., and Newell, A., ⊗4How Can Merlin Understand?⊗*, Carnegie-Mellon University
Department of Computer Science "preprint", November 15, 1973.

Neumann, J. von, ⊗4The Mathematician⊗*, in R.B. Heywood (ed), ⊗4The Works
of the Mind⊗*, U. Chicago Press, pp. 180-196, 1947.

Neumann, J. von, ⊗4The Computer and the Brain⊗*, Silliman Lectures, Yale U. Press,
1958.

Pager, David, ⊗4A Proposal for a Computer-based Interactive Scientific
Community⊗*, CACM 15, February, 1972, pp. 71-75.

Pager, David, ⊗4On the Problem of Communicating Complex Information⊗*,
CACM 16, May, 1973, pp. 275-281.

@Sloman, Aaron, ⊗4Interactions Between Philosophy and Artificial 
Intelligence: The Role of Intuition and Non-Logical Reasoning in
Intelligence⊗*, Artificial Intelligence 2, 1971, pp. 209-225.

Sloman, Aaron, ⊗4On Learning about Numbers⊗*,...

Winston, Patrick, ⊗4Learning Structural Descriptions
from Examples⊗*, Ph.D. thesis, Dept. of Electrical Engineering,
TR-76, Project MAC, TR-231, MIT AI Lab, September, 1970.

.END


.ASSEC(Acknowledgements)

I owe a great debt of thanks to many people, both for the
input of new ideas and for the evaluation, channelling, and pruning of
my own. Let me mention, alphabetically:
B. Buchanan, A. Cohn, R. Floyd, C. Green, D. Knuth, M. Lenat,
E. Sacerdoti, R. Waldinger,
R. Weyrauch,
and M. Wilber.
Let me also thank SRI for providing computer time for this research.

.ONCE TURN OFF "@"
The application of BEINGs to an Automatic Programming task is described
in [Lenat].  The problems with the domain of concept-formation-program-writing,
studied therein and summarized here in Appendix 1,
led to the AM project. 
A more complete description of AM can be perused as SYS4[TLK,DBL]@SU-AI.
The full body of knowledge we expect to provide to AM  is found on 
file GIVEN[AM,AJC]@SU-AI.

.ASEC(Appendix 1: Background for readers unfamiliar with Beings)

This appendix introduces the reader to the concept of organizing knowledge
as similarly-structured modules, called BEINGs. It should be skipped by those
familiar with that construction. For a more thorough treatment,
read either:
Section 4.6 of Green et al.,
⊗4Progress Report on Program-Understanding Systems⊗*, Memo AIM-240,
CS Report STAN-CS-74-444, Artificial Intelligence Laboratory, Stanford
University, August, 1974; or 
Lenat, ⊗4BEINGS: Knowledge as
Interacting Experts⊗*, 4th IJCAI, 1975 (preprint available from the author),
or Lenat, ⊗4Synthesis of Large Programs from Specific Dialogues⊗*,
3rd World Conference on Cybernetics and Systems, Rumania, 1975 (preprint).

.ASSEC(BEINGs and Experts)

Consider an interdisciplinary enterprise, attempted by a community of human
experts who are specialists in -- and only in -- their own fields.  What modes of 
interactions will be productive?  The dominant paradigm might well settle into
⊗4questioning and answering⊗* each other.
Instead of a chairman, suppose the group adopts rules for
gaining the floor, what a speaker may do,  and how to resolve disputes.
When a topic is being considered, one or two
experts might recognize it and speak up. In the course of their exposition
they might need to call on other specialists. This might be by name, by specialty,
or simply by posing a new sub-question and hoping someone could recognize his own
relevance and volunteer a suggestion.
If the task is to construct something, then the
activities of the experts should not be strictly verbal.  Often, one will 
recognize his relevance to the current situation and ask to ⊗4do⊗* something:
clarify or modify or (rarely) create.

What would it mean to ⊗4simulate⊗* the above activity?  Imagine several little 
programs, each one modelling a different expert. What should each program,
called a ⊗4BEING⊗*, be capable of?  It must possess a corpus of specific facts and
strategies for its designated speciality. It must interact via questioning and
answering other BEINGs. Each BEING should be able to recognize when it is relevant.
It must set up and alter structures, just as the human specialists do.

Let us return to our meeting of human experts.
To be more concrete, suppose their task is to design and code a large
computer program: a concept formation system [Hempel]. Experts who will be useful
include scientific programmers, non-programming psychologists,
system hackers, and management personnel.
Within each of these four major families will be many individual specialists.
What happens in the ensuing session?  When an expert participates, he will
either be aiding a colleague in some difficulty
or else transferring a tiny, customized 
bit of his expertise (facts about his field) into a programmed function
which can do something.  The final code reflects the members' knowledge,
in that sense.
Of course, experts within the same family will be able to communicate things
among themselves which are unintelligible to outsiders (e.g., when the hackers
start arguing about how to patch the system bugs that appear). 
Nevertheless, if we press him,
any of these specialists could transform his compressed jargon into a more universal
message (losing some information and some efficiency),
perhaps by giving examples and analogies.

Suppose the project sponsor is quasi-active, submitting an initial specification
order for
the program, and then participating in the work as a (somewhat privileged) member
of the team. This individual is the one who wants the final product, hence will be
called the ⊗4user⊗*.

How could BEINGs do all this? 
There would be some little program containing information
about ⊗7CONCEPT-FORMATION⊗*
(much more than would be used in writing any single concept formation program),
another BEING who knows
how to manage a group to
⊗7WRITE-PROGRAMS⊗*, and many lower-level specialists, for example 
⊗7INFO-OBTAINER, TEST, MODIFY-DATA-STRUCTURE, UNTIL-LOOP, 
VISUAL-PERCEPTION, AVOID-CONTRADICTION, PROPOSE-PLAUSIBLE-NAME⊗*.
Like the human specialists,
the BEINGs would contain far too much information, far too
inefficiently represented, to be able to say "we ourselves constitute
the desired program!"
They would have to discuss, and perhaps carry out, the concept formation task. They
would write specialized versions of themselves, programs which could do exactly what
the BEINGs did to carry out the task, no more nor less (although they would
hopefully take much less time and be more customized).
Some BEINGs 
(e.g., ⊗7TEST⊗*) may have several
distinct, streamlined fractions of themselves in the final program. BEINGs which
only aided other BEINGs (e.g., ⊗7PROPOSE-PLAUSIBLE-NAME⊗*)
may not have ⊗4any⊗* new correlates in the 
synthesized code.

An experimental system, PUP6, was designed and partially implemented. PUP6
synthesized a concept formation program (similar to Winston's), 
but the human user had to supply
certain specific answers to some of the BEINGs' critical queries.
A grammatical inference program and a  simple property list maintenance routine
were also generated. Only
a few new BEINGs had to be added to PUP6's original pool of 100
BEINGs in order to synthesize them, but problems of communication
flexibility remained.
The choice of mathematics as the domain for the proposed system was made partially
to alleviate this problem.

.SKIP 2

.ASSEC(Internal Design of BEINGs)

Now that we have developed our "external specifications" for what the 
BEINGs must do, how
exactly will they do it?  Have we merely pushed the problem of Artificial
Intelligence down into the coding of the BEINGs?  Perhaps not, for we still have
our analogy to the interacting experts. Let us carry it further, analyze
synergetic cooperation among the humans, then try to model that in our
internal design of BEINGs.

Viewing the group of experts as a single entity, what makes it
productive? The members must be very different in abilities, in order to handle
a complex task, yet similar in basic cognitive structure 
(in the anatomy of their minds) to
permit communication to flow easily.
For example, each psychologist knows how to direct a programmer to do
some of the things he can do, but the specific facts he has tucked away
under this category must be quite unique. Similarly, each expert may have
a set of strategies for
recognizing his own relevance to a
proposed question, but the ⊗4contents⊗* of that knowledge varies from
individual to individual.  The proposed hypothesis is that all the experts can be
said to consist of categorized information, where the set of 
categories is fairly standard, and indicates the ⊗4types⊗* of questions
any expert can be expected to answer. An expert is considered ⊗4equivalent⊗*
to his answers to several standard questions.
Each expert has the same mental "parts", it
is only the values stored in these parts, their contents,
which distinguish him as an individual. 
The particular set of questions he can deal with is fixed, depending on which family
the expert belongs to. There is much -- but not total -- overlap between what two
humans from different professions can meaningfully answer.

Armed with this dubious view of intelligence, let us return to the design of
BEINGs. Each BEING shall have many parts, each possessing a name (a question it
can deal with) and a value (a procedure capable of answering that question).
Henceforth, "⊗4part⊗*" will be used in this technical sense.
When a BEING asks a question, it is really just one
part who is asking. In fact, it must be that the ⊗4value⊗* subpart of some part
can't answer ⊗4his⊗* question without further assistance. He may not know
enough to call on specific other
BEINGs (in which case he broadcasts his plea, letting anyone 
respond who feels relevant), but
he should ⊗4always⊗* specify what BEING ⊗4part⊗* the question should be answered by.
By analogy with the experts, each BEING in the same family 
will have the same fixed 
set of types of parts (will answer the same kinds of queries), and this uniformity 
should permit painless intercommunication
between specialists in the same profession.  Many of these parts will be common to
more than one family (e.g., "How long-winded are you?").

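The flavor of this design can be conveyed by a tiny sketch. It is written in a
generic LISP notation rather than the INTERLISP of the actual system; the family
name, part names, and helper functions below are merely illustrative, not PUP6 code:

.B
;; A BEING is a fixed set of (part-name . procedure) pairs; every
;; BEING in a given family answers exactly the same set of questions.
(defparameter *operation-family-parts*
  '(identify effects when complexity args ordering))

(defun make-being (name family parts)
  "PARTS is an alist mapping each part name of FAMILY to a procedure."
  (list :name name :family family :parts parts))

(defun ask-part (being part-name &rest question)
  "Pose a question to one named part of BEING; NIL means no answer."
  (let ((proc (cdr (assoc part-name (getf being :parts)))))
    (and proc (apply proc question))))
.E
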
Since the paradigm of
the meeting is questioning and answering, the names of the parts should
cover all the types of questions one expert wants to ask another. Each part of
each BEING will have implicit access to this list: it may ask only these
types of questions. Each BEING should ⊗4not⊗* have access to the list of all
BEINGs in the system: requests should be phrased in terms of what is wanted;
rarely is the name of the answerer specified in advance.
(By analogy: the human speaker is not aware of precisely who is in the room;
when he feels inadequate, he asks for help and hopes someone responds).

Once again: the concept of a system of BEINGs is that many entities coexist, 
clumped into a few major family groupings. Each individual BEING has
a complex structure, but that structure does not vary much from BEING to BEING;
it does not vary at all among BEINGs of the same family.
This idea has analogues in many fields: transactional analysis in
psychology, anatomy in medicine, modular design in architecture.

To carry out these ideas, we build a system of BEINGs, a modular program
which will interact with a human user and go through the same conversations,
and arrive at the same end products that our human experts would.
Recasting the idea into operational terms, we arrive at this procedure for
writing a pool of BEINGs: 


(1) Study the task which the pool is to do. See
what kinds of questions are asked by simulated experts, and notice how the experts
divide into a few major families {f↓i}.
The total number of families is important: if there are too many, it is hard for
specialized communication to occur; if too few families, many BEINGs will be forced
to answer questions they consider irrelevant.

(2) Distill the corpus of collected communications into
a core of simple questions, Q↓f, for each family f,
in such a way that each inter-expert question or transfer
of control can be rephrased in terms of these Q's.
The sizes of the sets Q are very important.
If a Q is huge, addition of new
BEINGs will demand either great effort or great intelligence (an example of a
system like this is ACTORS). If a Q is too small, all the non-uniformity is simply
pushed down into the values of one or two general 
catchall questions (all first-order
logical languages do this). 

(3) List all the BEINGs who will be
present in the pool, by family, and fill in their parts. 
The time to encode knowledge into many simple representation schemes is
proportional to the square of (occasionally exponential in) 
the amount of interrelated knowledge (e.g., consider the frame problem).
The filling in of a new BEING is  ⊗4independent⊗* of
the number of BEINGs already in the pool, because BEINGs can communicate
via nondeterministic goal mechanisms, and not have to know the names of the BEINGs
who will answer their queries. This filling in is  of course linear in the number of
questions a BEING must answer (e.g., the maximum size of any Q↓f).

(4) The human user interacts with the completed
BEING community, until the desired task is complete.

.SKIP 2

.ASSEC(BEINGs Interacting)

We now have some idea of what the internal structure of BEINGs is, but how do they
meet the external specifications we set for them: the reproduction of the human
experts' conversations and finished products?  The question is that of control in
the system, and it splits into two parts: how does the "right" BEING gain control,
and what does he "do" when he gets control?

The scenario, in PUP6, runs as follows. 
When control is about to be passed, the relinquisher
will specify a set of possible recipients (successors). 
Two extreme but common cases are
a singleton (he knows who should go next) and the set of all BEINGs (he has no
idea whatsoever).  Each possible candidate for control is asked if he is relevant
to the current situation.(If a goal is currently set forth, his EFFECTS part will
be asked if this guy can bring about that goal; if an unintelligible piece of
information is sitting around somewhere, his IDENTIFY part will be asked if this
guy can recognize that piece of info.) If more than one BEING replies that it feels
relevant, then their WHEN components are asked ⊗4how⊗* relevant they are right now.
If a tie still exists, their COMPLEXITY components are asked to decide which will be
faster, surer, lead to auxiliary desired effects, etc.
There will always be ⊗4some⊗* BEING who will take over;
the general management
types of BEINGs are always able  -- but reluctant  -- to do so. 
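
A minimal sketch of this selection cycle, in the same illustrative notation as
above (the numeric scoring convention for WHEN and the management fallback are
assumptions, not the actual PUP6 protocol):

.B
;; Pass control: poll the candidates for relevance via IDENTIFY,
;; then break ties by asking WHEN how relevant each one is right now.
;; A fuller version would consult COMPLEXITY as a final tie-breaker;
;; a management BEING catches the case where nobody volunteers.
(defun choose-successor (candidates situation)
  (let ((relevant (remove-if-not
                    (lambda (b) (ask-part b 'identify situation))
                    candidates)))
    (if relevant
        (first (sort (copy-list relevant) #'>
                     :key (lambda (b)
                            (or (ask-part b 'when situation) 0))))
        (find :management candidates
              :key (lambda (b) (getf b :family))))))
.E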

Once in control, a BEING B picks one of its parts, evaluates it, and repeats this
process until it decides to relinquish control. At that time, it puts forth a
list of possible successors.
For example, the ARGS
part might be first; if it asks for some arguments which no BEING has 
supplied, then the whole BEING might decide to fail. Some  parts, when evaluated,
might create a new BEING, might ask questions which require this whole process
to repeat recursively, etc. 
This "asking" really means broadcasting a request to one or two parts of
some other BEINGs (often every other BEING); 
for example "Is there a known fast way of gronking toves?" would
be asked as a search for a BEING whose COMPLEXITY indicated speed, and whose
EFFECTS part contained a production with a template matching "gronking toves".
A list of the responders would be returned. 
The questioner might pose some
new questions directly to these BEINGs, might turn control over to them directly,
might simply want to know that some exist, etc.
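
Broadcasting itself is simple under this sketch: a request goes to one named part
of every BEING in the pool, and whoever answers non-NIL has volunteered. The :fast
value tested below is an invented convention for what a COMPLEXITY part might report:

.B
;; "Is there a known fast way of gronking toves?" becomes: collect
;; every BEING whose EFFECTS part matches the request, then keep
;; those whose COMPLEXITY part indicates speed.
(defun broadcast (pool part-name request)
  (remove-if-not (lambda (b) (ask-part b part-name request)) pool))

(defun fast-ways (pool request)        ; e.g., request = '(gronk toves)
  (remove-if-not (lambda (b) (eq (ask-part b 'complexity) :fast))
                 (broadcast pool 'effects request)))
.E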

How does each BEING decide which parts to evaluate, and in which order,
once it gains control?
For our humans, the answer is: a combination of individual intelligence, the
training inherent in the family of expert you are, and universally accepted
constraints (common sense). For our BEINGs, we postulate a part called
ORDERING; each BEING consults its own Ordering part, the Ordering part
for its Family, and the universal Ordering Archetypical BEING. They 
partially constrain what part must be evaluated before what other part.
This appears to be difficult or tedious for whoever writes
BEINGs, since it might vary from BEING to BEING. In fact,
it rarely does vary, and most of the necessary constraints can be learned by
the system as it runs, and inserted into the proper slots.

Reexamine the question:
"What parts are evaluated, and in what order, when a particular
BEING gains control?" This decision depends primarily on the ⊗4types⊗* of parts
present in the BEING, not on their ⊗4values⊗*.  But every BEING in a family 
has the same
anatomy, so one single algorithm,
located in that family's Ordering part,
can assemble any BEING's parts into an executable
LISP function. Moreover, this assembly can be done when the system is first
loaded (or when a new BEING is first created), and need only be redone for a
BEING when the values of its parts change. Such changes are rare: experts are
not often open-minded. Thus the family's BEINGs can be compiled into executable
LISP functions.
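
A sketch of that assembly step, with the same caveats as before (the closure
below merely stands in for the compiled LISP function):

.B
;; One algorithm per family suffices: the family's ORDERING part
;; fixes a sequence of part names, and assembling a BEING produces
;; a single function that evaluates the BEING's parts in that order.
;; It is redone only when some part's value changes.
(defun assemble (being family-ordering)
  (lambda (situation)
    (let (result)
      (dolist (part-name family-ordering result)
        (setf result (ask-part being part-name situation))))))
.E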

.SKIP 2

.ASSEC(Aspects of BEINGs Systems)

It would be aesthetically pleasing to postulate that the only entities which exist
are BEINGs. Since this would require BEINGs' parts to be BEINGs, hence have parts of
their own, etc., an explosive recurrence would occur.  To avoid this, we take a slightly
different tack. Suppose that each part which has the same name must also have the same
internal structure. The format for part P is stored in the Representation part of the
archetypical BEING named P. The only allowable formats are the following:
an opaque executable expression, a pointer to some BEING or some specific part of a
BEING, a list of executable forms and pointers to BEINGs. Notice that there are
only three formats, and that they are all quite simple.

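The three formats admit one uniform evaluator, sketched below in the same notation
(treating a symbol as a pointer to a BEING, and dereferencing it through EFFECTS,
are illustrative assumptions):

.B
;; A part's value is (1) an opaque executable expression, (2) a
;; pointer to a BEING or a part of one, or (3) a list of such things.
(defun evaluate-value (value situation pool)
  (cond ((functionp value) (funcall value situation))      ; format 1
        ((symbolp value)                                   ; format 2
         (let ((b (find value pool
                        :key (lambda (x) (getf x :name)))))
           (and b (ask-part b 'effects situation))))
        ((consp value)                                     ; format 3
         (mapcar (lambda (v) (evaluate-value v situation pool))
                 value))))
.E
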
We shall demand that BEINGs write only new BEINGs, never any new functions,
production systems, etc. The humans often succeeded by distilling a tiny
specialization of their expertise; the BEINGs work similarly. 
In the process of discovery, this splitting occurs usually when some subpart is more
interesting than the whole.
In the process of automatic code-writing, this creation occurred when a BEING knew
how to write a fast, short, specialized, streamlined version of itself which was
capable of executing some specific subprocess used in the final "target" concept
formation program.


To clarify what BEINGs are and are not, they are contrasted with some other ideas. 
BEINGs linearly but
inefficiently subsume such constructions as demons, functions, and assertions in an
associative data base
(in the earlier papers, brief demonstrations were provided).
FRAMES are sufficiently amorphous to subsume BEINGs. Philosophically,
FRAMES are meant to model perception, and intentionally rely on implicit
default values; BEINGs avoid making decisions without full awareness of the 
justification. 
This is also the difference  between HACKER and PUP6, the first experimental pool of
BEINGs. 
Since PUP6 wrote structured programs, it should be distinguished from macro
expansion. Macro procedures expand mechanically:
@2expand(sequence   m↓1  m↓2) = (sequence  
expand(m↓1)  expand(m↓2))⊗*. BEINGs could use
information gleaned during expansion of m↓1 to improve the way m↓2 was handled.
ACTORs, unlike BEINGs, have no fixed structure imposed, and do not broadcast
their messages (they specify who gets each message, by name or through a bureaucracy).


The performance of the BEINGs representation itself in PUP6 is mixed.
Two advantages were hoped for by using a uniform set of BEING parts.
Addition of new BEINGs to the pool was not easy (for untrained users)
but communication among
BEINGs ⊗4was⊗* easy (fast, natural). Two
advantages were hoped for by keeping the BEINGs highly structured.
The interactions (especially with the user) were
brittle, but
the complex tasks put to the system ⊗4were⊗* successfully completed.

The crippling problems are seen to be with user-system communication,
not with the BEINGs ideas themselves.
Sophisticated, bug-free programs ⊗4were⊗* generated, after hours of fairly high
level dialogue with an active user, after tens of thousands of messages passed
among the BEINGs.
Part of this success is attributed to distributing
the responsibility for writing code and for recognizing relevance, to a hundred
entities, rather than having a few central monitors worry about everything.
The standardization of parts made filling in the BEINGs' contents fairly painless,
both for the author and for BEINGs who had to write new BEINGs.

All this suggests two possible continuations, both of which are underway here
at the Stanford AI Lab. One is to
rethink the communication problems,
and develop a new system for the
concept formation program synthesis task. The earliest programs by our Automatic
Programming group had the goal "synthesize the target program somehow";
the later PUP1-6 research insisted on "getting the
target program by going through one ⊗4proper⊗* sequence of reasoning steps"; 
the group's proposed continuation wants several
untrained users to  succeed by many different "proper" routes.

This document proposes an alternative direction for research effort.
This other way of continuing
is to find a task where BEINGs are well-suited, where
the problems encountered in PUP6 won't recur. What ⊗4are⊗* BEINGs good for?
The idea of a fixed set of parts (which distinguishes them from ACTORs) is
useful if the mass of knowledge is 
too huge for one individual to keep "on top" of.
It then should be organized in a
very uniform way (to simplify preparing it for storage), 
yet it must also be highly structured
(to speed up retrieval). 
A final ideal would be to find a domain where slightly-trained users could work
naturally, without (them ⊗4or⊗* us) having to battle the staggering complexities
of natural language handling. 
For these reasons, the author has chosen ⊗4fundamental mathematics⊗* 
as a task domain.
BEINGs are big and slow, but valuable for organizing knowledge in ways 
meaningful to how it will be used. In this proposed system, BEINGs will be one
-- but not the only -- 
internal mechanism for representing and manipulating knowledge.

.COMMENT table of contents;

.EVERY HEADING(,,)
.PORTION CONTENTS
.NOFILL
.NARROW 6,8
.BEGIN CENTER
.GROUP SKIP 3
.SELECT 5

AUTOMATED THEORY FORMATION

IN MATHEMATICS




.SELECT 2
.TURN ON "{∞→"



Douglas B. Lenat



Artificial Intelligence Laboratory

Stanford University








⊗4Description as of:  {DATE}⊗*





⊗7Ph.D. Dissertation Research Proposal⊗*

.SKIP TO COLUMN 1
⊗5Table of Contents⊗*


.END

.PREFACE 10 MILLS
.TURN ON "{∞→"
.NARROW 7,7
.RECEIVE

.PAGE←0
.SELECT 1